Recommended Books and Resources


There are many good books on vector calculus that will get you up to speed on 
the basic ideas, illustrated with an abundance of examples. 


• H.M. Schey, “Div, Grad, Curl, and All That”
• Jerrold Marsden and Anthony Tromba, “Vector Calculus”


Schey develops vector calculus hand in hand with electromagnetism, using Maxwell's 
equations as a vehicle to build intuition for differential operators and integrals. Marsden 
and Tromba is a meatier book but the extra weight is because it goes slower, not further. 
Neither of these books covers much (if any) material that goes beyond what we do in
lectures. In large part this is because the point of vector calculus is to give us tools 
that we can apply elsewhere and the next steps involve turning to other courses. 


• Baxandall and Liebeck, “Vector Calculus”


This book does things differently from us, taking a more rigorous and careful path 
through the subject. For the most part, this involves being more careful from the outset
about what spaces different objects live in. All of this will be treated in later courses,
but if you're someone who likes all their i’s dotted, ε’s small, and ħ’s uncrossed, then
this is an excellent place to look.
0 Introduction


The development of calculus was a watershed moment in the history of mathematics. 
In its simplest form, we start with a function 


$$ f : \mathbb{R} \to \mathbb{R} $$


Provided that the function is continuous and smooth, we can do some interesting things. 
We can differentiate. And integrate. It’s hard to overstate the importance of these 
operations. It’s no coincidence that the discovery of calculus went hand in hand with 
the beginnings of modern science. It is, among other things, how we describe change. 


The purpose of this course is to generalise the concepts of differentiation and
integration to functions, or maps, of the form

$$ f : \mathbb{R}^m \to \mathbb{R}^n \qquad (0.1) $$

with m and n positive integers. Our goal is simply to understand the different ways
in which we can differentiate and integrate such functions. Because points in $\mathbb{R}^m$ and
$\mathbb{R}^n$ can be viewed as vectors, this subject is called vector calculus. It also goes by the
name of multivariable calculus.


The motivation for extending calculus to maps of the kind (0.1) is manifold. First, 
given the remarkable depth and utility of ordinary calculus, it seems silly not to explore 
such an obvious generalisation. As we will see, the effort is not wasted. There are 
several beautiful mathematical theorems awaiting us, not least a number of important 
generalisations of the fundamental theorem of calculus to these vector spaces. These 
ideas provide the foundation for many subsequent developments in mathematics, most 
notably in geometry. They also underlie every law of physics. 


Examples of Maps 


To highlight some of the possible applications, here are a few examples of maps (0.1) 
that we will explore in greater detail as the course progresses. Of particular interest 
are maps 


$$ f : \mathbb{R} \to \mathbb{R}^n \qquad (0.2) $$

These define curves in $\mathbb{R}^n$. A geometer might want to understand how these curves
twist and turn in the higher dimensional space or, for n = 3, how the curve ties itself
in knots. For a physicist, maps of this type are particularly important because they
describe the trajectory of a particle. Here the codomain $\mathbb{R}^n$ is identified as physical
space, an interpretation that is easiest to sell when n = 3 or, for a particle restricted
to move on a plane, n = 2.


Figure 1. On the left, the temperature on the surface of the Earth is an example of a map
from $\mathbb{R}^2 \to \mathbb{R}$, also known as a scalar field. On the right, the wind on the surface of the Earth
blows more or less horizontally and so can be viewed as a map from $\mathbb{R}^2 \to \mathbb{R}^2$, also known as
a vector field. (To avoid being co-opted by the flat Earth movement, I should mention that,
strictly speaking, each of these is a map from $S^2$ rather than $\mathbb{R}^2$.)


Before we go on, it will be useful to introduce some notation. We'll parameterise $\mathbb{R}$
by the variable t. Meanwhile, we denote points in $\mathbb{R}^n$ as x. A curve (0.2) in $\mathbb{R}^n$ is then
written as

$$ f : t \mapsto \mathbf{x}(t) $$

Here x(t) is the image of the map. But, in many situations below, we'll drop the f
and just refer to x(t) as the map. For a physicist, the parameter t is usually viewed as
time. In this case, repeated differentiation of the map with respect to t gives us first
velocity, and then acceleration.


Going one step further, we could consider maps $f : \mathbb{R}^2 \to \mathbb{R}^n$ as defining a surface
in $\mathbb{R}^n$. Again, a geometer might be interested in the curvature of this surface and
this, it turns out, requires an understanding of how to differentiate the maps. There are
then obvious generalisations to higher dimensional surfaces living in higher dimensional
spaces.


From the physics perspective, in the map (0.2) that defines a curve the codomain
$\mathbb{R}^n$ is viewed as physical space. A conceptually different set of functions arises when we
think of the domain $\mathbb{R}^m$ as physical space. For example, we could consider maps of the
kind

$$ f : \mathbb{R}^3 \to \mathbb{R} $$

where $\mathbb{R}^3$ is viewed as physical space. Physicists refer to this as a scalar field.
(Mathematicians refer to it as a map from $\mathbb{R}^3$ to $\mathbb{R}$.) A familiar example of such a map is
temperature: there exists a temperature at every point in this room and that gives
a map T(x). This is shown in Figure 1. A more fundamental, and ultimately more
interesting, example of a scalar field is the Higgs field in the Standard Model of particle
physics.


As one final example, consider maps of the form

$$ f : \mathbb{R}^3 \to \mathbb{R}^3 $$

where, again, the domain $\mathbb{R}^3$ is identified with physical space. Physicists call these
vector fields. (By now, you can guess what mathematicians call them.) In fundamental
physics, two important examples are provided by the electric field E(x) and magnetic
field B(x), first postulated by Michael Faraday: each describes a three-dimensional
vector associated to each point in space.


1 Curves 


In this section, we consider maps of the form

$$ f : \mathbb{R} \to \mathbb{R}^n $$

A map of this kind is called a parameterised curve, with $\mathbb{R}$ the parameter and the curve
the image of the map in $\mathbb{R}^n$. In what follows, we will denote the curve as C.


Whenever we do explicit calculations, we need to introduce some coordinates. The
obvious ones are Cartesian coordinates, in which the vector $\mathbf{x} \in \mathbb{R}^n$ is written as

$$ \mathbf{x} = (x^1, \ldots, x^n) = x^i \mathbf{e}_i $$

where, in the second expression, we're using summation convention and implicitly
summing over i = 1,...,n. Here $\{\mathbf{e}_i\}$ is a choice of orthonormal basis vectors, satisfying
$\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$. For $\mathbb{R}^n = \mathbb{R}^3$, we'll also write these as $\{\mathbf{e}_i\} = \{\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}\}$. (The notation
$\{\mathbf{e}_i\} = \{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ is also standard, although we won't adopt it in these lectures.)


The image of the function can then be written as x(t). In physics, we might think 
of this as the trajectory of a particle evolving in time t. Here, we'll mostly just view 
the curve as an abstract mathematical map, with t nothing more than a parameter 
labelling positions along the curve. In fact, one of the themes of this section is that, for
many calculations, the choice of parameter t is irrelevant. 


Examples 


Here are two simple examples. Consider first the map $\mathbb{R} \to \mathbb{R}^3$ that takes the form

$$ \mathbf{x}(t) = (at, bt^2, 0) $$

The image of the map is the parabola $a^2 y = b x^2$, lying in the plane z = 0, and is shown on
the right.

This looks very similar to what you would draw if asked to plot the graph $y = bx^2/a^2$,
with the additional requirement of z = 0 prompting the artistic flourish that results in
a curve suspended in 3d. Obviously, the curve x(t) and the function $y = bx^2/a^2$ (with
z = 0) are related, but they're not quite the same thing. The function $y = bx^2/a^2$
is usually thought of as a map $\mathbb{R} \to \mathbb{R}$ and in plotting a graph you include both the
domain and codomain. In contrast, on the right we've plotted only the image of the
curve x(t) in $\mathbb{R}^3$; the picture loses all information about the domain coordinate t.


Here is a second example that illustrates the same point. Consider

$$ \mathbf{x}(t) = (\cos t, \sin t, t) \qquad (1.1) $$

The resulting curve is a helix, shown to the right. Like any other curve, the choice of
parameterisation is not unique. We could, for example, consider the different map

$$ \mathbf{x}(t) = (\cos \lambda t, \sin \lambda t, \lambda t) $$

This gives the same helix as (1.1) for any choice of $\lambda \in \mathbb{R}$ as long as $\lambda \neq 0$. In
some contexts this matters. If, for example, t is time, and x(t) is the trajectory of a
rollercoaster then the fate of the contents of your stomach depends delicately on the
value of $\lambda$. However, there will be some properties of the curve that are independent
of the choice of parameterisation and, in this example, independent of $\lambda$. It is these
properties that will be our primary interest in this section.


Before we go on, a pedantic mathematical caveat. It may be that the domain of
the curve is not all of $\mathbb{R}$. For example, we could have the map $\mathbb{R} \to \mathbb{R}^2$ given by
$\mathbf{x}(t) = (t, \sqrt{1 - t^2})$. This makes sense only for the interval $t \in [-1, +1]$ and you should
proceed accordingly.


1.1 Differentiating the Curve 


The vector function x(t) is differentiable at t if, as $\delta t \to 0$, we can write

$$ \mathbf{x}(t + \delta t) - \mathbf{x}(t) = \dot{\mathbf{x}}(t)\, \delta t + O(\delta t^2) \qquad (1.2) $$

You should think of this expression as defining the derivative $\dot{\mathbf{x}}(t)$. If the derivative $\dot{\mathbf{x}}$
exists everywhere then the curve is said to be smooth. This means that it is continuous
and, as the name suggests, not egregiously jagged.


There are some notational issues to unpick in this expression. First, $O(\delta t^2)$ includes
all terms that scale as $\delta t^2$ or smaller as $\delta t \to 0$. This “big-O” notation is commonly
used in physics and applied mathematics. In pure maths you will also see the “little
o” notation $o(\delta t)$ which means “strictly smaller than $\delta t$” as $\delta t \to 0$. Roughly speaking
$o(\delta t)$ is the same thing as $O(\delta t^2)$. (In other courses you may encounter situations where
this speaking is too rough to be accurate, but it will suffice for our needs.) We'll stick
with big-O notation throughout these lectures.


We've denoted the derivative in (1.2) with a dot, $\dot{\mathbf{x}}(t)$. This was Newton’s original
notation for the derivative and, 350 years later, comes with some sociological baggage.
In physics, a dot is nearly always used to denote differentiation with respect to time, so
the velocity of a particle is $\dot{\mathbf{x}}$ and the acceleration is $\ddot{\mathbf{x}}$. Meanwhile a prime, like f'(x),
is usually used to denote differentiation with respect to space. This is deeply ingrained
in the psyche of physicists, so much so that I get a little shudder if I see something
like x'(t), even though it's perfectly obvious that it means dx/dt. Mathematicians,
meanwhile, seem to have no such cultural hang-ups on this issue. (They reserve their
cultural hang-ups for 1000 other issues.)


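We write the left-hand side of (1.2) as

$$ \delta \mathbf{x}(t) = \mathbf{x}(t + \delta t) - \mathbf{x}(t) $$

The derivative is then the vector

$$ \frac{d\mathbf{x}}{dt} = \dot{\mathbf{x}} = \lim_{\delta t \to 0} \frac{\delta \mathbf{x}}{\delta t} $$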


Here the familiar notation dx/dt for the derivative is due to Leibniz and works if we're
differentiating with respect to time, space, or anything else. We'll also sometimes use
the slightly sloppy notation and write

$$ d\mathbf{x} = \dot{\mathbf{x}}\, dt $$

which, at least for now, really just means the same thing as (1.2) except we've dropped
the $O(\delta t^2)$ terms.


It’s not difficult to differentiate vectors and, at least in Cartesian coordinates with
the basis vectors $\mathbf{e}_i$, we can just do it component by component

$$ \mathbf{x}(t) = x^i(t)\, \mathbf{e}_i \quad \Longrightarrow \quad \dot{\mathbf{x}}(t) = \dot{x}^i(t)\, \mathbf{e}_i $$

The same is true if we work in any other choice of basis vectors $\{\mathbf{e}_i\}$ provided that these
vectors themselves are independent of t. (In the lectures on Dynamics and Relativity
we encounter an example where the basis vectors do depend on time and you have to
be more careful. This arises in Section 6 on “Non-Inertial Frames”.)


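More generally, given a function f(t) and two vector functions g(t) and h(t), it’s
simple to check that the following Leibniz identities hold

$$ \frac{d}{dt}(f \mathbf{g}) = \frac{df}{dt}\, \mathbf{g} + f\, \frac{d\mathbf{g}}{dt} $$

$$ \frac{d}{dt}(\mathbf{g} \cdot \mathbf{h}) = \frac{d\mathbf{g}}{dt} \cdot \mathbf{h} + \mathbf{g} \cdot \frac{d\mathbf{h}}{dt} $$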


Figure 2. The derivative is the tangent vector to the curve. 


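Moreover, if g(t) and h(t) are vectors in $\mathbb{R}^3$, we also have the cross-product identity

$$ \frac{d}{dt}(\mathbf{g} \times \mathbf{h}) = \frac{d\mathbf{g}}{dt} \times \mathbf{h} + \mathbf{g} \times \frac{d\mathbf{h}}{dt} $$

As usual, we have to be careful with the ordering of terms in the cross product because,
for example, $d\mathbf{g}/dt \times \mathbf{h} = -\mathbf{h} \times d\mathbf{g}/dt$.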


1.1.1 Tangent Vectors 


There is a nice geometric meaning to the derivative $\dot{\mathbf{x}}(t)$ of a parameterised curve C: it
gives the tangent to the curve and is called, quite reasonably, the tangent vector. This
is shown in Figure 2.

The direction of the tangent vector $\dot{\mathbf{x}}(t)$ is geometrical (at least up to a sign): it
depends only on the curve C itself, and not on the choice of parameterisation. In
contrast, the magnitude of the tangent vector $|\dot{\mathbf{x}}(t)|$ does depend on the parameterisation.
This is obvious mathematically, since we're differentiating with respect to t, and also
physically where $\dot{\mathbf{x}}$ is identified with the velocity of a particle.
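The statement about direction and magnitude can be checked numerically. The following is a minimal sketch, not part of the lectures: it takes the helix (1.1) with the rescaled parameter from the earlier example (called `lam` here), estimates the tangent vector by a central difference, and compares the two parameterisations at the same point of the curve. The helper names `tangent`, `norm` and `unit` are illustrative choices.

```python
import math

def helix(t, lam=1.0):
    """Helix (cos(lam t), sin(lam t), lam t); lam rescales the parameter."""
    return (math.cos(lam * t), math.sin(lam * t), lam * t)

def tangent(curve, t, h=1e-6):
    """Central-difference estimate of the derivative of curve at t."""
    a, b = curve(t - h), curve(t + h)
    return tuple((bi - ai) / (2 * h) for ai, bi in zip(a, b))

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def unit(v):
    n = norm(v)
    return tuple(c / n for c in v)

# The same point on the curve is reached at t = 0.6 (lam = 1) and t = 0.2 (lam = 3).
v1 = tangent(lambda t: helix(t, 1.0), 0.6)
v3 = tangent(lambda t: helix(t, 3.0), 0.2)

speed_ratio = norm(v3) / norm(v1)   # magnitudes differ by the factor lam = 3
direction_gap = max(abs(a - b) for a, b in zip(unit(v1), unit(v3)))
```

Here `speed_ratio` comes out close to 3 while `direction_gap` is zero up to finite-difference error: the two tangent vectors point the same way but have different lengths.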


Sometimes, you may find yourself with an unwise choice of parameterisation in which
the derivative vector $\dot{\mathbf{x}}$ vanishes at some point. For example, consider the curve in $\mathbb{R}^2$
given by

$$ \mathbf{x}(t) = (t^3, t^3) $$

The curve C is just the straight line x = y. The tangent vector is $\dot{\mathbf{x}} = 3t^2 (1, 1)$, which
clearly points along the line x = y but with magnitude $3\sqrt{2}\, t^2$, and so vanishes at t = 0.
Clearly this is not a property of C itself, but of our choice of parameterisation. We get
the same curve C from the map x(t) = (t, t) but now the tangent vector is everywhere
non-vanishing.


A parameterisation is called regular if $\dot{\mathbf{x}}(t) \neq 0$ for all t. In what follows, we will
assume that we are dealing with regular parameterisations except, perhaps, at isolated 
points. This means that we can divide the curve into segments, each of which is regular. 


As a slightly technical aside, we will sometimes have cause to consider piecewise
smooth curves of the form $C = C_1 + C_2 + \ldots$, where the end of one curve lines up with
the beginning of the next, as shown on the right. In this case, a tangent vector exists
everywhere except at the cusps where two curves meet.

1.1.2 The Arc Length 

We can use the tangent vectors to compute the length of the curve. From Figure 2, we
see that the distance between two nearby points is

$$ \delta s = |\delta \mathbf{x}| + O(|\delta \mathbf{x}|^2) = |\dot{\mathbf{x}}\, \delta t| + O(\delta t^2) $$

We then have

$$ \frac{ds}{dt} = \pm \left| \frac{d\mathbf{x}}{dt} \right| = \pm |\dot{\mathbf{x}}| \qquad (1.3) $$

where we get the plus sign for distances measured in the direction of increasing t, and
the minus sign in the direction of decreasing t.


If we pick some starting point $t_0$ on the curve, then the distance along the curve to
any point $t > t_0$ is given by

$$ s = \int_{t_0}^{t} dt'\, |\dot{\mathbf{x}}(t')| $$

This distance is called the arc length, s. Because $|\dot{\mathbf{x}}| > 0$, this is a positive and strictly
increasing function as we move away in the direction $t > t_0$. It is a negative, and
strictly decreasing function in the direction $t < t_0$.


Although the tangent vector $\dot{\mathbf{x}}$ depends on the choice of parameterisation, the arc
length s does not. We can pick a different parameterisation of the curve $\tau(t)$, which we
will take to be an invertible and smooth function. We will also assume that $d\tau/dt > 0$
so that they both measure “increasing time” in the same direction. The chain rule tells
us that

$$ \frac{d\mathbf{x}}{dt} = \frac{d\mathbf{x}}{d\tau} \frac{d\tau}{dt} \qquad (1.4) $$

We can then compute the arc length using the $\tau$ parameterisation: it is

$$ s = \int_{t_0}^{t} dt' \left| \frac{d\mathbf{x}}{dt'} \right| = \int_{\tau_0}^{\tau} d\tau'\, \frac{dt'}{d\tau'} \left| \frac{d\mathbf{x}}{d\tau'} \frac{d\tau'}{dt'} \right| = \int_{\tau_0}^{\tau} d\tau' \left| \frac{d\mathbf{x}}{d\tau'} \right| \qquad (1.5) $$


In the second equality, we find the contribution from the chain rule (1.4) together with
a factor from the measure that comes from integrating over $d\tau$ instead of dt. These then
cancel in the third equality. The upshot is that we can compute the arc length using
any parameterisation that we wish. Or, said differently, the arc length is independent 
of the choice of parameterisation of the curve. 


We can now turn this on its head. All parameterisations of the curve give the same 
arc length. But this means that the arc length itself is, in many ways, the only natural 
parameterisation of the curve. We can then think of x(s) with the corresponding
tangent vector dx/ds. From (1.3), we see that this choice of the tangent vector always
has unit length: $|d\mathbf{x}/ds| = 1$.


As an aside: issues of this kind rear their head in the physics of special relativity
where time means different things for people moving at different speeds. This means 
that there is no universally agreed “absolute time" and so different people will parame- 
terise the trajectory of a particle x(t) in different ways. There's no right or wrong way, 
but it's annoying if someone does it differently to you. (Admittedly, this is only likely 
to happen if they are travelling at an appreciable fraction of the speed of light relative 
to you.) Happily there is something that everyone can agree on, which is the special 
relativistic version of arc length. It's known as proper time. You can read more about 
this in the lectures on Dynamics and Relativity. 


An Example 


To illustrate these ideas, let's return to our helix example of (1.1). We had
$\mathbf{x}(t) = (\cos t, \sin t, t)$ and so $\dot{\mathbf{x}}(t) = (-\sin t, \cos t, 1)$. Our defining equation (1.3) then becomes
(taking the positive sign)

$$ \frac{ds}{dt} = |\dot{\mathbf{x}}| = \sqrt{2} $$

If we take $t_0 = 0$, then the arc length measured from the point x = (1, 0, 0) is $s = \sqrt{2}\, t$.
In particular, after time $t = 2\pi$ we've made a full rotation and sit at $\mathbf{x} = (1, 0, 2\pi)$.
These two points are shown as red dots in the figure. Obviously the direct route between
the two has distance $2\pi$. Our analysis above shows that the distance along the helix is
$s = \sqrt{8}\, \pi$.
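As a numerical cross-check of both the helix result and the reparameterisation argument of (1.5), here is a small sketch. It integrates the speed along one turn of the helix twice: once in the original parameter t, and once in a hypothetical new parameter u defined by t = u³ + u, which is chosen purely for illustration and is not taken from the lectures. The function names are likewise assumptions.

```python
import math

def arc_length(velocity, t0, t1, steps=20000):
    """Midpoint-rule estimate of the integral of |dx/dt| from t0 to t1."""
    dt = (t1 - t0) / steps
    total = 0.0
    for k in range(steps):
        v = velocity(t0 + (k + 0.5) * dt)
        total += math.sqrt(sum(c * c for c in v)) * dt
    return total

# Helix (1.1): x(t) = (cos t, sin t, t), so dx/dt = (-sin t, cos t, 1).
def helix_velocity(t):
    return (-math.sin(t), math.cos(t), 1.0)

# Hypothetical reparameterisation t = u**3 + u (monotonic, so dt/du > 0).
# By the chain rule, dx/du = (dx/dt) * (dt/du).
def helix_velocity_u(u):
    t = u**3 + u
    dt_du = 3 * u**2 + 1
    return tuple(c * dt_du for c in helix_velocity(t))

# u_end solves u**3 + u = 2*pi, found by bisection on [0, 2].
lo, hi = 0.0, 2.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if mid**3 + mid < 2 * math.pi:
        lo = mid
    else:
        hi = mid
u_end = 0.5 * (lo + hi)

s_t = arc_length(helix_velocity, 0.0, 2 * math.pi)
s_u = arc_length(helix_velocity_u, 0.0, u_end)
exact = math.sqrt(8) * math.pi
```

Both `s_t` and `s_u` agree with `exact` to within the quadrature error, even though the speed is constant in t and far from constant in u.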


1.1.3 Curvature and Torsion 


There is a little bit of simple geometry associated to these ideas. Given a curve C,
parameterised by its arc length s, we have already seen that the tangent vector

$$ \mathbf{t} = \frac{d\mathbf{x}}{ds} $$

has unit length, $|\mathbf{t}| = 1$. (Note: don’t confuse the bold faced tangent vector $\mathbf{t}$ with
our earlier parameterisation t: they're different objects!) We can also consider the
“acceleration” of the curve with respect to the arc length, $d^2\mathbf{x}/ds^2$. The magnitude of
this “acceleration” is called the curvature

$$ \kappa(s) = \left| \frac{d^2 \mathbf{x}}{ds^2} \right| \qquad (1.6) $$


To build some intuition, we can calculate the curvature of a circle of radius R. If
we start with a simple parameterisation $\mathbf{x}(t) = (R\cos t, R\sin t)$ then you can check
using (1.3) that the arc length is s = Rt. We then pick the new parameterisation
$\mathbf{x}(s) = (R\cos(s/R), R\sin(s/R))$. We then find that a circle of radius R has constant
curvature

$$ \kappa = \frac{1}{R} $$

Note, in particular, that as $R \to \infty$, the circle becomes a straight line which has
vanishing curvature.


There is also a unit vector associated to this “acceleration”, defined as

$$ \mathbf{n} = \frac{1}{\kappa} \frac{d^2 \mathbf{x}}{ds^2} = \frac{1}{\kappa} \frac{d\mathbf{t}}{ds} $$

This is known as the principal normal. Note that the factor of $1/\kappa$ ensures that $|\mathbf{n}| = 1$.

Importantly, if $\kappa \neq 0$ then n is perpendicular to the tangent vector t. This follows
from the fact that $\mathbf{t} \cdot \mathbf{t} = 1$ and so $d/ds\,(\mathbf{t} \cdot \mathbf{t}) = 2\kappa\, \mathbf{n} \cdot \mathbf{t} = 0$. This means that t and n
define a plane, associated to every point in the curve. This is known as the osculating
plane.


For any point s on the curve, there is an associated osculating plane. Now draw a
circle in that plane that touches the curve at the point s, whose curvature matches $\kappa(s)$.
This is called the osculating circle and is shown in green in the figure. This is the circle
that just kisses the curve at s.

Next we can ask: how does the osculating plane vary as we move along the curve? This
is simplest to discuss if we restrict to curves in $\mathbb{R}^3$. In this case, we have the cross
product at our disposal and we can define the unit normal to the osculating plane as

$$ \mathbf{b} = \mathbf{t} \times \mathbf{n} $$

This is known as the binormal, to distinguish it from the normal n. The three vectors
t, n and b define an orthonormal basis for $\mathbb{R}^3$ at each point s along the curve (at least
as long as $\kappa(s) \neq 0$). This basis twists and turns along the curve.


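Note that $|\mathbf{b}| = 1$ which, using the same argument as for t above, tells us that
$\mathbf{b} \cdot d\mathbf{b}/ds = 0$. In addition, we have $\mathbf{t} \cdot \mathbf{b} = 0$ which, after differentiating, tells us that

$$ 0 = \frac{d\mathbf{t}}{ds} \cdot \mathbf{b} + \mathbf{t} \cdot \frac{d\mathbf{b}}{ds} = \kappa\, \mathbf{n} \cdot \mathbf{b} + \mathbf{t} \cdot \frac{d\mathbf{b}}{ds} $$

But, by definition, $\mathbf{n} \cdot \mathbf{b} = 0$. So we learn that $\mathbf{t} \cdot d\mathbf{b}/ds = 0$. In other words, db/ds is
orthogonal to both b and to t. Which means that it must be parallel to n. We define
the torsion $\tau(s)$ as a measure of how the binormal changes

$$ \frac{d\mathbf{b}}{ds} = -\tau(s)\, \mathbf{n} \qquad (1.7) $$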


From the definition, you can see that the torsion is a measure of the third derivative
$d^3\mathbf{x}/ds^3$. The minus sign means that if the top of the green circle in the figure tilts
towards us, then $\tau > 0$; if it tilts away from us then $\tau < 0$. Heuristically, the curvature
captures how much the curve fails to be a straight line, while the torsion captures how
much the curve fails to be planar.
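The Frenet-Serret Equations

There is a closed set of formulae describing curvature and torsion. These are the
Frenet-Serret equations,

$$ \frac{d\mathbf{t}}{ds} = \kappa\, \mathbf{n} \qquad (1.8) $$

$$ \frac{d\mathbf{b}}{ds} = -\tau\, \mathbf{n} \qquad (1.9) $$

$$ \frac{d\mathbf{n}}{ds} = \tau\, \mathbf{b} - \kappa\, \mathbf{t} \qquad (1.10) $$

The first of these (1.8) is simply the definition of the normal n. The second (1.9) is
the definition (1.7) of the torsion.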


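That leaves us with (1.10). We'll again start with the definition $\mathbf{b} = \mathbf{t} \times \mathbf{n}$, and this
time take the cross product with t. The triple product formula then gives us

$$ \mathbf{b} \times \mathbf{t} = (\mathbf{t} \times \mathbf{n}) \times \mathbf{t} = (\mathbf{t} \cdot \mathbf{t})\, \mathbf{n} - (\mathbf{n} \cdot \mathbf{t})\, \mathbf{t} = \mathbf{n} $$

Now taking the derivative with respect to s, using (1.8) and (1.9) and noting that
$\mathbf{b} = \mathbf{t} \times \mathbf{n}$ and $\mathbf{t} = \mathbf{n} \times \mathbf{b}$ then gives us (1.10).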


It's useful to rewrite the first two equations (1.8) and (1.9) using $\mathbf{n} = \mathbf{b} \times \mathbf{t}$ so that
we have

$$ \frac{d\mathbf{t}}{ds} = \kappa\, (\mathbf{b} \times \mathbf{t}) \quad \text{and} \quad \frac{d\mathbf{b}}{ds} = -\tau\, (\mathbf{b} \times \mathbf{t}) $$

This is six first order equations for six unknowns, b(s) and t(s). If we are given $\kappa(s)$
and $\tau(s)$, together with initial conditions b(0) and t(0), then we can solve for b(s) and
t(s) and can subsequently solve for the curve x(s). The way to think about this is
that the curvature and torsion $\kappa(s)$ and $\tau(s)$ specify the curve, up to translation and
orientation.
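To see the Frenet-Serret frame at work on a concrete curve, the sketch below takes the helix (1.1) written in terms of its arc length, s = √2 t, and builds t, n and b from nested central differences. For this helix a short calculation by hand gives κ = τ = 1/2, and the code checks those values together with the third equation (1.10). The step size `h`, the sample point `s0` and all function names are illustrative assumptions, and nested finite differences are only accurate to a few digits.

```python
import math

C = math.sqrt(2.0)  # |dx/dt| for the helix (1.1), so s = C * t

def x(s):
    """Helix (1.1) parameterised by arc length s."""
    return (math.cos(s / C), math.sin(s / C), s / C)

def diff(f, s, h=1e-3):
    """Central-difference derivative of a vector-valued function f at s."""
    return tuple((b - a) / (2 * h) for a, b in zip(f(s - h), f(s + h)))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def tangent(s):
    return diff(x, s)

def curvature(s):
    dt_ds = diff(tangent, s)
    return math.sqrt(dot(dt_ds, dt_ds))

def normal(s):
    k = curvature(s)
    return tuple(c / k for c in diff(tangent, s))

def binormal(s):
    return cross(tangent(s), normal(s))

s0 = 0.7
kappa = curvature(s0)
tau = -dot(diff(binormal, s0), normal(s0))   # from db/ds = -tau n

# Residual of dn/ds = tau b - kappa t, component by component.
dn = diff(normal, s0)
t0, b0 = tangent(s0), binormal(s0)
residual = max(abs(dn[i] - (tau * b0[i] - kappa * t0[i])) for i in range(3))
```

Both `kappa` and `tau` come out close to 0.5 and `residual` is small, so the helix has constant curvature and constant torsion, with the positive sign of the torsion fixed by the convention in (1.7).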


1.2 Line Integrals 


Given a curve C in $\mathbb{R}^n$ and some function defined over $\mathbb{R}^n$, we may well wish to integrate
the function along the curve. There are different stories to tell for scalar and vector
fields and we deal with each in turn.


1.2.1 Scalar Fields 


A scalar field is a map

$$ \phi : \mathbb{R}^n \to \mathbb{R} $$

With coordinates x on $\mathbb{R}^n$, we'll denote this scalar field as $\phi(\mathbf{x})$.

Given a parameterised curve C in $\mathbb{R}^n$, which we denote as x(t), it might be tempting
to put these together to get the function $\phi(\mathbf{x}(t))$ which is a composite map $\mathbb{R} \to \mathbb{R}$.
We could then just integrate over t in the usual way.


However, there's a catch. The result that you get will depend on the function
$\phi$, the curve C, and the choice of parameterisation of the curve. There's nothing wrong
with this per se, but it's not what we want here. For many purposes, it turns out to be more
useful to have a definition of the integral that depends only on the function $\phi$ and the
curve C, but gives the same answer for any choice of parameterisation of the curve.


One way to achieve this is to work with the arc length s which, as we've seen, is the
natural parameterisation along the curve. We can integrate from point a to point b,
with $\mathbf{x}(s_a) = \mathbf{a}$ and $\mathbf{x}(s_b) = \mathbf{b}$ and $s_a < s_b$, by defining the line integral

$$ \int_C \phi\, ds = \int_{s_a}^{s_b} \phi(\mathbf{x}(s))\, ds $$

where the right-hand side is now viewed as a usual one-dimensional integral.

This line integral is, by convention, defined so that $\int_C ds$ gives the length of the
curve C and, in particular, is always positive. In other words, there's no directional
information in this integral: it doesn't matter what way you move along the curve.


Suppose that we're given the curve C in terms of some other parameterisation
x(t), with $\mathbf{x}(t_a) = \mathbf{a}$ and $\mathbf{x}(t_b) = \mathbf{b}$. The usual change of variables tells us that

$$ \int_C \phi\, ds = \int_{t_a}^{t_b} \phi(\mathbf{x}(t))\, \frac{ds}{dt}\, dt $$

We can then use (1.3). If $t_b > t_a$ then we have $ds/dt = +|\dot{\mathbf{x}}|$ and

$$ \int_C \phi\, ds = \int_{t_a}^{t_b} \phi(\mathbf{x}(t))\, |\dot{\mathbf{x}}(t)|\, dt \qquad (1.11) $$

Meanwhile, if $t_b < t_a$ then we have $ds/dt = -|\dot{\mathbf{x}}|$ and

$$ \int_C \phi\, ds = \int_{t_b}^{t_a} \phi(\mathbf{x}(t))\, |\dot{\mathbf{x}}(t)|\, dt $$

We see that the line integral comes with the length of the tangent vector $|\dot{\mathbf{x}}|$ in the
integrand. This is what ensures that the line integral is actually independent of the
choice of parameterisation: the argument is the same as the one we used in (1.5) to
show that the arc length is invariant under reparameterisations. Upon a change of
variables, the single derivative d/dt in $\dot{\mathbf{x}}$ cancels the Jacobian from the integral $\int dt$.
Furthermore, the minus signs work out so that you're always integrating from a smaller
value of t to a larger one, again ensuring that $\int_C ds$ is positive and so can be interpreted
as the length of the curve.
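The following sketch illustrates the two properties just described, under assumptions that are not in the lectures: the scalar field is taken to be φ(x, y) = y, the curve is the upper unit semicircle, and the second parameterisation (reversed and non-uniform) is invented for the purpose. The integrand carries the speed as in (1.11) and the integration always runs from the smaller to the larger parameter value. For this choice the exact answer is 2.

```python
import math

def scalar_line_integral(phi, curve, t_start, t_end, steps=20000, h=1e-6):
    """Estimate the line integral of phi ds along curve.

    Always integrates from the smaller to the larger parameter value,
    weighting phi by the speed |dx/dt| as in (1.11).
    """
    lo, hi = min(t_start, t_end), max(t_start, t_end)
    dt = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        t = lo + (k + 0.5) * dt
        p, q = curve(t - h), curve(t + h)
        speed = math.sqrt(sum(((b - a) / (2 * h)) ** 2 for a, b in zip(p, q)))
        total += phi(curve(t)) * speed * dt
    return total

phi = lambda p: p[1]   # illustrative scalar field phi(x, y) = y

# Upper unit semicircle, traversed from (1, 0) to (-1, 0).
forward = lambda t: (math.cos(t), math.sin(t))

# Same semicircle traversed from (-1, 0) to (1, 0) at a non-uniform rate.
backward = lambda u: (math.cos(math.pi - u * u - u), math.sin(math.pi - u * u - u))
u_end = (-1 + math.sqrt(1 + 4 * math.pi)) / 2   # solves u**2 + u = pi

I_forward = scalar_line_integral(phi, forward, 0.0, math.pi)
I_backward = scalar_line_integral(phi, backward, 0.0, u_end)
I_swapped = scalar_line_integral(phi, forward, math.pi, 0.0)
```

All three estimates agree with the exact value 2: neither the rate at which the curve is traversed nor the direction of travel changes the scalar line integral.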


1.2.2 Vector Fields 


Vector fields are maps of the form

$$ \mathbf{F} : \mathbb{R}^n \to \mathbb{R}^n $$

so that at each point $\mathbf{x} \in \mathbb{R}^n$ we have a vector-valued object F(x). We would like to
understand how to integrate a vector field along a curve C.


There are two ways to do this. We could work component-wise, treating each compo- 
nent like the scalar field example above. After doing the integration, this would leave 
us with a vector. 


However it turns out that, in many circumstances, it's more useful to integrate the
vector field so that the integral gives us just a number. We do this by integrating the
component of the vector field that lies tangent to the curve. Usually, this is what is
meant by the line integral of a vector field.


In more detail, suppose that our curve C has a parameterisation x(t) and we wish
to integrate from $t_a$ to $t_b$, with $\mathbf{x}(t_a) = \mathbf{a}$ and $\mathbf{x}(t_b) = \mathbf{b}$. The line integral of a vector
field F along C is defined to be

$$ \int_C \mathbf{F} \cdot d\mathbf{x} = \int_{t_a}^{t_b} \mathbf{F}(\mathbf{x}(t)) \cdot \dot{\mathbf{x}}(t)\, dt \qquad (1.12) $$

Once again, this doesn’t depend on the choice of parameterisation t. This is manifest in
the expression on the left where the parameterisation isn't mentioned. The right-hand
side is invariant for the same reason as (1.11).


This time, however, there's a slightly different story to tell about minus signs. We
should think of each curve C as coming with an orientation, which is the direction along
the curve. Equivalently, it can be thought of as the direction of the tangent vector $\dot{\mathbf{x}}$.
In the example above, the orientation of the curve is from a to b. This then determines
the limits of the integral, from $t_a$ to $t_b$, since $\mathbf{x}(t_a) = \mathbf{a}$ and $\mathbf{x}(t_b) = \mathbf{b}$. Note that the
limits are always this way round, regardless of whether our parameterisation has $t_a < t_b$
or $t_b < t_a$: the orientation determines the limits, not the parameterisation.

In summary, the line integral for a scalar field $\int_C \phi\, ds$ is independent of the orientation
and, if $\phi$ is positive, the integral will also be positive. In contrast, the integral of the
vector field $\int_C \mathbf{F} \cdot \dot{\mathbf{x}}\, dt$ depends on the orientation. Flip the orientation of the curve,
and the integral will change sign.


An Example 


As a slightly baroque example, consider the vector field in $\mathbb{R}^3$,

$$ \mathbf{F}(\mathbf{x}) = (x e^{y}, z^2, xy) $$

To evaluate the line integral, we also need to specify the curve C along which we perform
the integral. We'll consider two options, both of which evolve from x(t = 0) = (0, 0, 0) to
x(t = 1) = (1, 1, 1). Our first curve is

$$ C_1 : \quad \mathbf{x}(t) = (t, t^2, t^3) $$

This is shown in the figure. Evaluated on $C_1$, we have $\mathbf{F}(\mathbf{x}(t)) = (t e^{t^2}, t^6, t^3)$.
Meanwhile $\dot{\mathbf{x}} = (1, 2t, 3t^2)$ so we have

$$ \int_{C_1} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 dt\, \mathbf{F} \cdot \dot{\mathbf{x}} = \int_0^1 dt \left( t e^{t^2} + 2t^7 + 3t^5 \right) = \frac{1}{4}(1 + 2e) $$

Our second curve is simply the straight line

$$ C_2 : \quad \mathbf{x}(t) = (t, t, t) $$

Evaluated on this curve, we have $\mathbf{F}(\mathbf{x}(t)) = (t e^{t}, t^2, t^2)$. Now the tangent vector is
$\dot{\mathbf{x}} = (1, 1, 1)$ and the integral is

$$ \int_{C_2} \mathbf{F} \cdot d\mathbf{x} = \int_0^1 dt\, \mathbf{F} \cdot \dot{\mathbf{x}} = \int_0^1 dt \left( t e^{t} + 2t^2 \right) = \frac{5}{3} \qquad (1.13) $$

(The first term, $t e^{t}$, is integrated by parts.)


The main lesson to take from this is the obvious one: the answers are different. The 
result of a line integral generally depends on both the thing you’re integrating F and 
the choice of curve C. 
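The two answers can be confirmed with a few lines of numerical quadrature. This is a sketch under the assumption that the field is F = (x e^y, z², xy) and that C₁ is x(t) = (t, t², t³), which is what the evaluated expressions in the example imply. The helper `vector_line_integral` and the names `C1_dot` and `C2_dot` for the tangent vectors are illustrative.

```python
import math

def vector_line_integral(F, curve, velocity, t_a, t_b, steps=20000):
    """Midpoint-rule estimate of the integral of F(x(t)) . dx/dt from t_a to t_b."""
    dt = (t_b - t_a) / steps   # negative if the limits run backwards
    total = 0.0
    for k in range(steps):
        t = t_a + (k + 0.5) * dt
        Fx = F(curve(t))
        v = velocity(t)
        total += sum(f * c for f, c in zip(Fx, v)) * dt
    return total

def F(p):
    x, y, z = p
    return (x * math.exp(y), z * z, x * y)

C1 = lambda t: (t, t**2, t**3)
C1_dot = lambda t: (1.0, 2 * t, 3 * t**2)
C2 = lambda t: (t, t, t)
C2_dot = lambda t: (1.0, 1.0, 1.0)

I1 = vector_line_integral(F, C1, C1_dot, 0.0, 1.0)
I2 = vector_line_integral(F, C2, C2_dot, 0.0, 1.0)

# Reversing the orientation swaps the limits and flips the sign.
I1_reversed = vector_line_integral(F, C1, C1_dot, 1.0, 0.0)
```

Here `I1` reproduces (1 + 2e)/4 and `I2` reproduces 5/3, while `I1_reversed` equals `-I1`, in line with the remarks on orientation above.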


Figure 3. Decomposing a curve by introducing new segments with opposite orientations.


More Curves, More Integrals 


We'll see plenty more examples of line integrals, both in this course and in later ones. 
Here are some comments to set the scene. 


First, there will be occasions when we want to perform a line integral around a
closed curve C, meaning that the starting and end points are the same, a = b. For
such curves, we introduce new notation and write the line integral as

$$ \oint_C \mathbf{F} \cdot d\mathbf{x} $$


with the little circle on the integral sign there to remind us that we're integrating 
around a loop. This quantity is called the circulation of F around C. The name comes 
from Fluid Mechanics where we might view F as the velocity field of a fluid, and the 
circulation quantifies the swirling motion of the fluid. 


On other occasions, we may find ourselves in a situation in which the curve C
decomposes into a number of piecewise smooth curves $C_i$, joined up at their end points. We
write $C = C_1 + C_2 + \ldots$, and the line integral is

$$ \int_C \mathbf{F} \cdot d\mathbf{x} = \int_{C_1} \mathbf{F} \cdot d\mathbf{x} + \int_{C_2} \mathbf{F} \cdot d\mathbf{x} + \ldots $$

It is also useful to think of the curve −C as the same as the curve C but with the
opposite orientation. This means that we have the expression

$$ \int_{-C} \mathbf{F} \cdot d\mathbf{x} = -\int_C \mathbf{F} \cdot d\mathbf{x} $$

For example, we could return to our previous baroque example and consider the closed
curve $C = C_1 - C_2$. This curve starts at x = (0, 0, 0), travels along $C_1$ to x = (1, 1, 1)
and then returns back along $C_2$ in the opposite direction to the arrow. From our
previous answers, we have

$$ \oint_C \mathbf{F} \cdot d\mathbf{x} = \int_{C_1} \mathbf{F} \cdot d\mathbf{x} - \int_{C_2} \mathbf{F} \cdot d\mathbf{x} = \frac{1}{4}(1 + 2e) - \frac{5}{3} $$

There are lots of games that we can play like this. For example, it's sometimes useful to
take a smooth closed curve C and decompose it into two piecewise smooth segments.
An example is shown in Figure 3, where we've introduced two new segments, which
should be viewed as infinitesimally close to each other. These two new segments have
opposite orientation and so cancel out in any integral. In this way, we can think of the
original curve as $C = C_1 + C_2$. We'll see other examples of these kinds of manipulations
as we progress.


1.3 Conservative Fields 


Here's an interesting question. In general the line integral of a vector field depends on 
the path taken. But is this ever not the case? In other words, are there some vector 
fields F for which the line integral depends only on the end points and not on the route 
you choose to go between them? 


Such a vector field F would obey

$$ \int_{C_1} \mathbf{F} \cdot d\mathbf{x} = \int_{C_2} \mathbf{F} \cdot d\mathbf{x} $$

for any $C_1$ and $C_2$ that share the same end points a and b and the same orientation.
Equivalently, we could consider the closed curve $C = C_1 - C_2$ and write this as

$$ \oint_C \mathbf{F} \cdot d\mathbf{x} = 0 $$

for all closed curves C. To answer this question about vector fields, we first need to
introduce a new concept for scalar fields.


1.3.1 The Gradient 


Let's return to the scalar field

$$ \phi : \mathbb{R}^n \to \mathbb{R} $$


We want to ask: how can we differentiate such a function? 
With Cartesian coordinates $\mathbf{x} = (x^1, \ldots, x^n)$ on $\mathbb{R}^n$, the scalar field is a function
$\phi(x^1, \ldots, x^n)$. Given such a function of several variables, we can always take partial
derivatives, which means that we differentiate with respect to one variable while keeping
all others fixed. For example,

$$ \frac{\partial \phi}{\partial x^1} = \lim_{\epsilon \to 0} \frac{\phi(x^1 + \epsilon, x^2, \ldots, x^n) - \phi(x^1, x^2, \ldots, x^n)}{\epsilon} \qquad (1.14) $$

If all n partial derivatives exist then the function is said to be differentiable.


The partial derivatives offer n different ways to differentiate our scalar field. We will
sometimes write this as

$$ \partial_i \phi = \frac{\partial \phi}{\partial x^i} \qquad (1.15) $$

where the $\partial_i$ can be useful shorthand when doing long calculations. While the notation
of the partial derivative tells us what's changing, it's just as important to remember
what's kept fixed. If, at times, there's any ambiguity this is sometimes highlighted by
writing

$$ \left( \frac{\partial \phi}{\partial x^1} \right)_{x^2, \ldots, x^n} $$

where the subscripts tell us what remains unchanged as we vary $x^1$. We won't use this
notation in these lectures since it should be obvious what variables are being held fixed.


The n different partial derivatives can be packaged together into a vector field. To do
this, we introduce the orthonormal basis of vectors $\{\mathbf{e}_i\}$ associated to the coordinates
$x^i$. The gradient of a scalar field is then a vector field, defined as

$$\nabla\phi = \frac{\partial\phi}{\partial x^i}\,\mathbf{e}_i \qquad (1.16)$$

where we're using the summation convention in which we implicitly sum over the
repeated $i = 1,\ldots,n$ index.


Because $\nabla\phi$ is a vector field, it may be more notationally consistent to write it in
bold font as $\boldsymbol{\nabla}\phi$. However, I'll stick with $\nabla\phi$. There's no ambiguity here because the
symbol $\nabla$ only ever means the gradient, never anything else, and so is always a vector.
It's one of the few symbols in mathematics and physics whose notational meaning is
fixed.

For scalar fields $\phi(x,y,z)$ in $\mathbb{R}^3$, the gradient is

$$\nabla\phi = \frac{\partial\phi}{\partial x}\,\hat{\mathbf{x}} + \frac{\partial\phi}{\partial y}\,\hat{\mathbf{y}} + \frac{\partial\phi}{\partial z}\,\hat{\mathbf{z}}$$

where we've written the orthonormal basis as $\{\mathbf{e}_i\} = \{\hat{\mathbf{x}}, \hat{\mathbf{y}}, \hat{\mathbf{z}}\}$.


There's a useful way to view the vector field $\nabla\phi$. To see this, note that if we want
to know how the function $\phi$ changes in a given direction $\hat{\mathbf{n}}$, with $|\hat{\mathbf{n}}| = 1$, then we just
need to take the inner product $\hat{\mathbf{n}}\cdot\nabla\phi$. This is known as the directional derivative and
sometimes denoted $D_{\hat{\mathbf{n}}}\phi = \hat{\mathbf{n}}\cdot\nabla\phi$. The directional derivative is maximal at
any point $\mathbf{x}$ when $\hat{\mathbf{n}}$ lies parallel to $\nabla\phi(\mathbf{x})$. But this is telling us something important:
at each point in space, the vector $\nabla\phi(\mathbf{x})$ is pointing in the direction in which $\phi(\mathbf{x})$
changes most quickly.
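To make the directional derivative concrete, here is a minimal numerical sketch. The scalar field `phi`, the sample point and the helper names `grad` and `directional_derivative` are all illustrative choices, not anything defined in these notes; the partial derivatives are estimated by central differences rather than computed exactly.

```python
import math

def grad(phi, x, h=1e-6):
    """Central-difference estimate of the gradient of phi at the point x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((phi(xp) - phi(xm)) / (2 * h))
    return g

def directional_derivative(phi, x, n):
    """n . grad(phi), for a unit vector n."""
    return sum(ni * gi for ni, gi in zip(n, grad(phi, x)))

# Hypothetical scalar field on R^2, chosen only for illustration
phi = lambda x: x[0] ** 2 + 3 * x[1]

point = [1.0, 2.0]
g = grad(phi, point)                      # analytically (2x, 3) = (2, 3)
norm = math.sqrt(sum(c * c for c in g))
n_along = [c / norm for c in g]           # unit vector parallel to grad(phi)
n_other = [1.0, 0.0]                      # some other unit vector

d_along = directional_derivative(phi, point, n_along)
d_other = directional_derivative(phi, point, n_other)
```

In this sketch `d_along` comes out as the length of the gradient and exceeds `d_other`, which is the statement that the gradient points in the direction of fastest increase.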


1.3.2 Back to Conservative Fields 


First a definition. A vector field F is called conservative if it can be written as 
$$\mathbf{F} = \nabla\phi$$

for some scalar field $\phi$ which, in this context, is referred to as a potential. (The odd
name "conservative" derives from the conservation of energy in Newtonian mechanics;
we will see the connection to this below.) Finally, we can answer the question that we
introduced at the beginning of this section: when is a line integral independent of the 
path? 


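Claim: The line integral around any closed curve vanishes if and only if F is
conservative.

Proof: Consider a conservative vector field of the form $\mathbf{F} = \nabla\phi$. We'll integrate
this along a curve C that interpolates from point a to point b, with parameterisation
$\mathbf{x}(t)$ for $t_a \le t \le t_b$. We have

$$\int_C \mathbf{F}\cdot d\mathbf{x} = \int_C \nabla\phi\cdot d\mathbf{x} = \int_{t_a}^{t_b} \frac{\partial\phi}{\partial x^i}\frac{dx^i}{dt}\,dt = \int_{t_a}^{t_b} \frac{d}{dt}\phi(\mathbf{x}(t))\,dt$$

where the last equality follows from the chain rule. But now we have the integral of a
total derivative, so

$$\int_C \mathbf{F}\cdot d\mathbf{x} = \Big[\phi(\mathbf{x}(t))\Big]_{t_a}^{t_b} = \phi(\mathbf{b}) - \phi(\mathbf{a})$$

which depends only on the end points as promised.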


Conversely, given the vector field F whose integral vanishes when taken around any
closed curve, it is always possible to construct a potential $\phi$. We first choose a value
of $\phi$ at the origin. There's no unique choice here, reflecting the fact that the
potential $\phi$ is only defined up to an overall constant. We can take $\phi(\mathbf{0}) = 0$. Then, at
any other point $\mathbf{y}$, we define

$$\phi(\mathbf{y}) = \int_{C(\mathbf{y})} \mathbf{F}\cdot d\mathbf{x}$$

where $C(\mathbf{y})$ is a curve that starts at the origin and ends at the point $\mathbf{y}$ as shown in
the figure above. Importantly, by assumption $\oint \mathbf{F}\cdot d\mathbf{x} = 0$, so it doesn't matter which
curve C we take: they all give the same answer.


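It remains only to show that $\nabla\phi = \mathbf{F}$. This is straightforward. Reverting to our
original definition of the partial derivative (1.14), we have

$$\frac{\partial\phi}{\partial x^i}(\mathbf{y}) = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left[\int_{C(\mathbf{y}+\epsilon\mathbf{e}_i)} \mathbf{F}\cdot d\mathbf{x} - \int_{C(\mathbf{y})} \mathbf{F}\cdot d\mathbf{x}\right]$$

The first integral goes along $C(\mathbf{y})$, and then continues along the red line shown in the
figure to the right. Meanwhile, the second integral goes back along $C(\mathbf{y})$. The upshot is
that the difference between them involves only the integral along the red line

$$\frac{\partial\phi}{\partial x^i}(\mathbf{y}) = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\int_{\text{red line}} \mathbf{F}\cdot d\mathbf{x}$$

The red line is taken to be the straight line in the $x^i$ direction. This means that the
line integral projects onto the $F_i$ component of the vector F. Since we're integrating
this over a small segment of length $\epsilon$, the integral gives $\int_{\text{red line}} \mathbf{F}\cdot d\mathbf{x} \approx F_i\,\epsilon$ and, after
taking the limit $\epsilon \to 0$, we have

$$\frac{\partial\phi}{\partial x^i}(\mathbf{y}) = F_i(\mathbf{y})$$

This is our desired result $\nabla\phi = \mathbf{F}$.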


It's clear that the result above is closely related to the fundamental theorem of 
calculus: the line integral of a conservative vector field is the analog of the integral of 
a total derivative and so is given by the end points. We'll meet more analogies along 
the same lines as we proceed. 


Given a vector field F, how can we tell if there's a corresponding potential so that
we can write $\mathbf{F} = \nabla\phi$? There's one straightforward way to check: for a conservative
vector field, the components $\mathbf{F} = F_i\mathbf{e}_i$ are given by

$$F_i = \frac{\partial\phi}{\partial x^i}$$

Differentiating again, we have

$$\frac{\partial F_i}{\partial x^j} = \frac{\partial^2\phi}{\partial x^j\,\partial x^i} = \frac{\partial F_j}{\partial x^i} \qquad (1.17)$$

where the second equality follows from the fact that the order of partial derivatives
doesn't matter (at least for suitably well behaved functions). This means that a
necessary condition for F to be conservative is that $\partial_i F_j = \partial_j F_i$. Later in these lectures we
will see that (at least locally) this is actually a sufficient condition.


An Example

Consider the (totally made up) vector field

$$\mathbf{F} = (3x^2 y\sin z,\; x^3\sin z,\; x^3 y\cos z)$$

Is this conservative? We have $\partial_1 F_2 = 3x^2\sin z = \partial_2 F_1$ and $\partial_1 F_3 = 3x^2 y\cos z = \partial_3 F_1$
and, finally, $\partial_2 F_3 = x^3\cos z = \partial_3 F_2$. So it passes the derivative test. Indeed, it's not
then hard to check that

$$\mathbf{F} = \nabla\phi \quad\text{with}\quad \phi = x^3 y\sin z$$

Knowing this makes it trivial to evaluate the line integral $\int_C \mathbf{F}\cdot d\mathbf{x}$ along any curve C
since it is given by $\phi(\mathbf{b}) - \phi(\mathbf{a})$ where a and b are the end points of C.
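As a numerical sanity check of path independence for this field, the sketch below integrates F along two different curves between the same end points and compares both with $\phi(\mathbf{b}) - \phi(\mathbf{a})$. The end points, the two paths and the helper `line_integral` (a simple midpoint rule) are illustrative assumptions, not part of the notes.

```python
import math

def F(x, y, z):
    # The vector field from the example above
    return (3 * x**2 * y * math.sin(z), x**3 * math.sin(z), x**3 * y * math.cos(z))

def phi(x, y, z):
    # Its potential
    return x**3 * y * math.sin(z)

def line_integral(path, n=20000):
    """Midpoint-rule estimate of the integral of F . dx along path(t), t in [0, 1]."""
    total = 0.0
    for k in range(n):
        t0, t1 = k / n, (k + 1) / n
        p0, p1 = path(t0), path(t1)
        f = F(*path(0.5 * (t0 + t1)))
        total += sum(fi * (q1 - q0) for fi, q0, q1 in zip(f, p0, p1))
    return total

# Hypothetical end points and paths, both running from a to b
a = (0.0, 0.0, 0.0)
b = (1.0, 2.0, 1.0)
straight = lambda t: (t, 2 * t, t)
curved = lambda t: (t**2, 2 * math.sin(0.5 * math.pi * t), t**3)

I_straight = line_integral(straight)
I_curved = line_integral(curved)
exact = phi(*b) - phi(*a)
```

Both estimates should agree with `exact` up to discretisation error, even though the two curves are quite different.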
Exact Differentials

There is a slightly different and more abstract way of phrasing the idea of a conservative
vector field. First, given a function $\phi(\mathbf{x})$ on $\mathbb{R}^n$, the differential is defined to be

$$d\phi = \frac{\partial\phi}{\partial x^i}\,dx^i = \nabla\phi\cdot d\mathbf{x}$$

It's a slightly formal object, obviously closely related to the derivative. The differential
is itself a function of $\mathbf{x}$ and captures how much the function $\phi$ changes as we move in
any direction.

Next, consider a vector field $\mathbf{F}(\mathbf{x})$ on $\mathbb{R}^n$. We can take the inner product with an
infinitesimal vector to get the object $\mathbf{F}\cdot d\mathbf{x}$. In fancy maths language, this is called a
differential form. (Strictly it's an object known as a differential one-form.) It's best to
think of $\mathbf{F}\cdot d\mathbf{x}$ as something that we should integrate along a curve.

A differential form is said to be exact if it can be written as

$$\mathbf{F}\cdot d\mathbf{x} = d\phi$$

for some function $\phi$. This is just a rewriting of our earlier idea: a differential is exact
if and only if the vector field is conservative. In this case, it takes the form $\mathbf{F} = \nabla\phi$
and so the associated differential is

$$\mathbf{F}\cdot d\mathbf{x} = \frac{\partial\phi}{\partial x^i}\,dx^i = d\phi$$

where the last equality follows from the chain rule.


1.3.3 An Application: Work and Potential Energy 


There’s a useful application of these ideas in Newtonian mechanics. The trajectory 
x(t) of a particle is governed by Newton's second law which reads 


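$$m\ddot{\mathbf{x}} = \mathbf{F}(\mathbf{x})$$

where, in this context, $\mathbf{F}(\mathbf{x})$ can be thought of as a force field. An important concept
in Newtonian mechanics is the kinetic energy of a particle, $K = \frac{1}{2}m\dot{\mathbf{x}}^2$. (This is more
often denoted as T in theoretical physics.) As the particle's position changes in time,
the kinetic energy changes as

$$K(t_2) - K(t_1) = \int_{t_1}^{t_2} \frac{dK}{dt}\,dt = \int_{t_1}^{t_2} m\,\dot{\mathbf{x}}\cdot\ddot{\mathbf{x}}\,dt = \int_{t_1}^{t_2} \dot{\mathbf{x}}\cdot\mathbf{F}\,dt = \int_C \mathbf{F}\cdot d\mathbf{x}$$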


The line integral of the force F along the trajectory C of the particle is called the work 
done. 


Something special happens for conservative forces. These can be written as 
$$\mathbf{F} = -\nabla V \qquad (1.18)$$

for some choice of V. (Note: the minus sign is just convention.) From the result above,
for a conservative force the work done depends only on the end points, not on the path
taken. We then have

$$K(t_2) - K(t_1) = \int_C \mathbf{F}\cdot d\mathbf{x} = -V(t_2) + V(t_1) \quad\Rightarrow\quad K(t) + V(t) = \text{constant}$$

We learn that a conservative force, one that can be written as (1.18), has a conserved
energy E = K + V. Indeed, it's this conservation of energy that lends its name to the
more general idea of a "conservative" vector field. We'll have use of these ideas in the
lectures on Dynamics and Relativity.
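The link between work and energy can be illustrated with a small simulation. This is only a sketch under stated assumptions: the potential `V`, the initial conditions, the time step and the choice of the velocity Verlet integrator are all illustrative and not taken from the notes.

```python
def V(x, y):
    # Hypothetical potential, chosen only for illustration
    return 0.5 * (x * x + 4.0 * y * y)

def force(x, y):
    # F = -grad V for the potential above
    return (-x, -4.0 * y)

def kinetic(m, vx, vy):
    return 0.5 * m * (vx * vx + vy * vy)

m, dt, steps = 1.0, 1e-3, 10000
x, y, vx, vy = 1.0, 0.5, 0.0, 1.0

K_start = kinetic(m, vx, vy)
E_start = K_start + V(x, y)
work = 0.0

fx, fy = force(x, y)
for _ in range(steps):
    # Velocity Verlet: half kick, drift, half kick
    vx += 0.5 * dt * fx / m
    vy += 0.5 * dt * fy / m
    x_new, y_new = x + dt * vx, y + dt * vy
    # Accumulate F . dx along this short straight segment (midpoint rule)
    fmx, fmy = force(0.5 * (x + x_new), 0.5 * (y + y_new))
    work += fmx * (x_new - x) + fmy * (y_new - y)
    x, y = x_new, y_new
    fx, fy = force(x, y)
    vx += 0.5 * dt * fx / m
    vy += 0.5 * dt * fy / m

K_end = kinetic(m, vx, vy)
E_end = K_end + V(x, y)
```

Up to the small error of the integrator, `K_end - K_start` matches the accumulated `work`, and `E_end` matches `E_start`.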


1.3.4 A Subtlety 


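Here's a curious example. Consider the vector field on $\mathbb{R}^2$ given by

$$\mathbf{F} = \left(-\frac{y}{x^2+y^2},\; \frac{x}{x^2+y^2}\right)$$

Is this conservative? If we run our check (1.17), we find

$$\frac{\partial F_x}{\partial y} = \frac{\partial F_y}{\partial x} = \frac{y^2 - x^2}{(x^2+y^2)^2}$$

which suggests that this is, indeed, a conservative field. Indeed, you can quickly check
that

$$\mathbf{F} = \nabla\phi \quad\text{with}\quad \phi(x,y) = \tan^{-1}\left(\frac{y}{x}\right)$$

(To see this, write $\tan\phi = y/x$ and recall that $\partial(\tan\phi)/\partial x = (\cos\phi)^{-2}\,\partial\phi/\partial x = (1 + \tan^2\phi)\,\partial\phi/\partial x$,
with a similar expression when you differentiate with respect to y. A
little algebra will then convince you that the above is true.)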


Let's now integrate F along a closed curve C that is a circle of radius R surrounding
the origin. We take $\mathbf{x}(t) = (R\cos t, R\sin t)$ with $0 \le t < 2\pi$ and the line integral is

$$\oint_C \mathbf{F}\cdot d\mathbf{x} = \int_0^{2\pi} \mathbf{F}\cdot\frac{d\mathbf{x}}{dt}\,dt = \int_0^{2\pi}\left(-\frac{\sin t}{R}\cdot(-R\sin t) + \frac{\cos t}{R}\cdot R\cos t\right)dt = 2\pi$$

Well, that's annoying! We've just proven that the integral of any conservative vector
field around a closed curve C necessarily vanishes, and yet one of our first examples
seems to show otherwise! What's going on?


The deal is that $\phi(x, y)$ is not a well behaved function on $\mathbb{R}^2$. In particular, it's not
continuous along the y-axis: as $x \to 0$ the function $\phi$ approaches either $+\pi/2$ or $-\pi/2$
depending on whether y/x is positive or negative. Implicit in our previous proof was
the requirement that we have a continuous function $\phi$, well defined everywhere on $\mathbb{R}^2$.
Strictly speaking, a conservative field should have $\mathbf{F} = \nabla\phi$ with $\phi$ continuous.

Relatedly, F itself isn't defined everywhere on $\mathbb{R}^2$ because it is singular at the origin.
Strictly speaking, F is only defined on the plane $\mathbb{R}^2$ with the point at the origin removed.
We write this as $\mathbb{R}^2 - \{(0,0)\}$.

We learn that we should be careful. The line integral of a conservative vector field
around a closed curve C is only vanishing if the vector field is well defined everywhere
inside C.
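The role of the singular point can be seen numerically. The sketch below integrates the field $\mathbf{F} = (-y, x)/(x^2+y^2)$ around two circles, one enclosing the origin and one not; the helper `circulation`, the circle centres and the radius are illustrative choices.

```python
import math

def F(x, y):
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circulation(cx, cy, R, n=2000):
    """Midpoint-rule estimate of the closed line integral of F around a circle
    of radius R centred at (cx, cy), traversed anticlockwise."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = cx + R * math.cos(t), cy + R * math.sin(t)
        fx, fy = F(x, y)
        # dx/dt = (-R sin t, R cos t)
        total += (fx * (-R * math.sin(t)) + fy * (R * math.cos(t))) * dt
    return total

around_origin = circulation(0.0, 0.0, 1.0)      # encloses the singular point
away_from_origin = circulation(3.0, 0.0, 1.0)   # F is well defined everywhere inside
```

The first value should be close to $2\pi$ and the second close to zero, in line with the discussion above.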


Usually pathological examples like this are of interest only to the most self-loathing 
of pure mathematicians. But not in this case. The subtlety that we've seen above 
later blossoms into some of the most interesting ideas in both mathematics and physics 
where it underlies key aspects in the study of topology. In the above example, the
space $\mathbb{R}^2 - \{(0,0)\}$ has a different topology from $\mathbb{R}^2$ because in the latter case all loops
are contractible, while in the former case there are non-contractible loops that circle
the origin. It turns out that one can characterise the topology of a space by studying
the kinds of functions that live on it. In particular, the functions that satisfy the
check (1.17) but cannot be written as $\mathbf{F} = \nabla\phi$ with a continuous $\phi$ play a particularly
important role, as they encode a lot of information about the topology of the underlying
space.

2 Surfaces (and Volumes)


The main purpose of this chapter is to understand how to generalise the idea of an 
integral. Rather than integrating over a line, we will instead look at how to integrate 
over a 2d surface. We'll then see how to generalise to the integration over a 3d volume 
or, more generally, an n-dimensional space. 


2.1 Multiple Integrals 


We'll start by explaining what it means to integrate over a region in $\mathbb{R}^2$ or over a region
in $\mathbb{R}^3$. The former are called area, or surface, integrals; the latter volume integrals. By
the time we've understood volume integrals, the extension to $\mathbb{R}^n$ will be obvious.


2.1.1 Area Integrals 


Consider a region $D \subset \mathbb{R}^2$. Given a scalar function $\phi: \mathbb{R}^2 \to \mathbb{R}$, we want to find a way
to integrate $\phi$ over D. We write this as

$$\int_D \phi(\mathbf{x})\,dA \qquad (2.1)$$

You should think of the area element dA as representing an infinitesimally small area,
with the $\int$ sign telling us that we're summing over many such small areas, in much
the same way as $\int dx$ should be thought of as summing over infinitesimally small line
elements dx. The area element is also written as $dA = dx\,dy$.


The rough idea underlying the integral is straightforward. First, we find a way to
tessellate D with some simple shape, say a rectangle or other polygon. Each shape has
common area $\delta A$. Admittedly, there might be some difficulty in making this work
around the edge, but we'll ignore this for now. Then we might approximate the integral as

$$\int_D \phi(\mathbf{x})\,dA \approx \sum_n \phi(\mathbf{x}_n)\,\delta A$$

where $\mathbf{x}_n$ is a point in the middle of each shape. We can then consider making $\delta A$
smaller and smaller, so that we tessellate the region D with finer and finer shapes.
Intuitively, we might expect that as $\delta A \to 0$, the sum converges on an answer and,
moreover, this answer is independent of any choices that we made along the way, such
as what shape we use and how we deal with the edges. When the limit exists, as it will
for any sensible choice of function $\phi$ and region D, then it converges to the integral.
If the function in the integrand is simply $\phi = 1$ then the integral (2.1) calculates the
area of the region D.


Just as an ordinary, one-dimensional integral can be viewed as the area under a curve,
so too can an area integral be viewed as the volume under a function. This interpretation
follows simply by plotting $z = \phi(x,y)$ in 3d as shown to the right.

Evaluating Area Integrals

In practice, we evaluate area integrals (or, indeed, higher dimensional integrals) by
reducing them to multiple ordinary integrals.


There are a number of different ways to do this, and some may be more convenient 
than others, although all will give the same answer. For example, we could parcel our 
region D into narrow horizontal strips of width dy like so: 


For each value of y, we then do the x integral between the two limits $x_1(y)$ and $x_2(y)$.
We then subsequently sum over all such strips by doing the y integral between the two
outer limits of the shape which we call a and b. The net result is

$$\int_D \phi(x,y)\,dA = \int_a^b dy \int_{x_1(y)}^{x_2(y)} dx\; \phi(x,y) \qquad (2.2)$$

In this approach, the information about the shape D appears in the limits of the integral
$x_1(y)$ and $x_2(y)$ which trace the outline of D as y changes.

If your shape is suitably annoying, then one or more of the integrals may have to be
decomposed into disjoint sets. An example is shown on the right. In this case, for some
values of y we need two further functions $\tilde{x}_1(y)$ and $\tilde{x}_2(y)$ to trace the outline of D and
the integral in (2.2) is defined to be

$$\int_{x_1(y)}^{x_2(y)} dx = \int_{x_1(y)}^{\tilde{x}_1(y)} dx + \int_{\tilde{x}_2(y)}^{x_2(y)} dx$$

with the obvious generalisation if more disjoint intervals are needed.


We should pause at this point to make a comment on notation. You may be used
to writing integrals as $\int (\text{integrand})\,dx$, with the thing you're integrating sandwiched
between the $\int$ sign and the dx. Indeed, that's the convention that we've been using
up until now. But, as you progress through mathematics, there is a time to dump this
notation and we have now reached that time. When performing multiple integrals, it
becomes annoying to remember where you should place all those dx's and dy's, not least
because they're not conveying any further information. So we instead write integrals
as $\int dx\,(\text{integrand})$, with the dx placed next to the integral sign. There's nothing deep
in this. It's just a different convention, albeit one that holds your hand a little less.
Think of it like that time you took the training wheels off your bike.

Our new notation does, however, retain the idea of ordering. You should work from
right to left, first performing the $\int dx$ integration in (2.2) to get a function of y, and
subsequently performing the $\int dy$ integration.

Note also that the number of $\int$ signs is not conserved in (2.2). On the left, $\int dA$ is
an area integral and so requires us to do two normal integrals which are then written
explicitly on the right. Shortly we will meet volume integrals and denote them as
$\int dV$. Some texts prefer a convention in which there is a conservation of integral signs
and so write area integrals as $\iint dA$ and volume integrals as $\iiint dV$. The authors
of these texts aren't string theorists and have never had to perform an integral in ten
dimensions. Here we refuse to adopt this notation on the grounds that it looks silly.


There is a different way to do the integral (2.2). We could just as well divide our
region D into vertical strips of width dx, so that it looks like this:

For each value of x, we do the y integral between the two limits $y_1(x)$ and $y_2(x)$.
As before, these functions trace the shape of the region D. We then subsequently sum
over all strips by doing the x integral between the two outer limits of the shape which
we now call c and d. Now the result is

$$\int_D \phi(x,y)\,dA = \int_c^d dx \int_{y_1(x)}^{y_2(x)} dy\; \phi(x,y) \qquad (2.3)$$

There are other ways to divide up the region D, some of which we will meet below
when we discuss different coordinate choices. Fubini's theorem, proven in 1907, states
that, for suitably well behaved functions $\phi(x,y)$ and regions D, all different ways of
decomposing the integral agree. We won't prove this theorem here but it guarantees
that the result that you get from doing the integrals in (2.2) coincides with the result
from (2.3).
An Example

As a simple example, consider the function

$$\phi(x,y) = x^2 y$$

integrated over the triangle D shown in the figure.

We'll do the area integral in two different ways. If we first do the $\int dx$ integration,
as in (2.2), then we have

$$\int_D \phi\,dA = \int_0^1 dy \int_0^{2-2y} dx\; x^2 y = \int_0^1 dy\; y\left[\frac{x^3}{3}\right]_0^{2-2y} = \frac{8}{3}\int_0^1 dy\; y(1-y)^3 = \frac{2}{15}$$

Meanwhile, doing the $\int dy$ integration first, as in (2.3), we have

$$\int_D \phi\,dA = \int_0^2 dx \int_0^{1-x/2} dy\; x^2 y = \int_0^2 dx\; x^2\left[\frac{y^2}{2}\right]_0^{1-x/2} = \frac{1}{2}\int_0^2 dx\; x^2\left(1-\frac{x}{2}\right)^2 = \frac{2}{15}$$

The two calculations give the same answer as advertised.
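The agreement between the two orderings can also be checked numerically. In this sketch the triangle is read off from the limits in the calculation above (x from 0 to 2 - 2y with y from 0 to 1, or equivalently y from 0 to 1 - x/2 with x from 0 to 2); the helper `midpoint` and the number of sample points are illustrative choices.

```python
def phi(x, y):
    return x * x * y

def midpoint(f, lo, hi, n=400):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

# x integral first: horizontal strips, as in (2.2)
x_first = midpoint(
    lambda y: midpoint(lambda x: phi(x, y), 0.0, 2.0 - 2.0 * y),
    0.0, 1.0)

# y integral first: vertical strips, as in (2.3)
y_first = midpoint(
    lambda x: midpoint(lambda y: phi(x, y), 0.0, 1.0 - x / 2.0),
    0.0, 2.0)

exact = 2.0 / 15.0
```

Both `x_first` and `y_first` should land close to `exact`, with the small difference coming from the finite strip width.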
Figure 4. A change of coordinates from (x,y) to (u,v).


2.1.2 Changing Coordinates 


Our discussion above was very much rooted in Cartesian coordinates. What if we 
choose to work with a different set of coordinates on $\mathbb{R}^2$?


Consider a change of variables $(x, y) \to (u,v)$. To be a good change of coordinates,
the map should be smooth and invertible and we will assume that this is the case. The 
region D can then equally well be parameterised by coordinates (u,v). An example is 
shown in Figure 4, with lines of constant u and constant v plotted in green. We want 
to know how to do the area integral in the (u, v) coordinates. 


Claim: The area integral can be written as

$$\int_D dx\,dy\; \phi(x,y) = \int_{D'} du\,dv\; |J(u,v)|\, \phi(u,v) \qquad (2.4)$$

The region D in the (x, y) plane is mapped into a different region D' in the (u,v) plane.
Here $\phi(u, v)$ is slightly sloppy shorthand: it means the function $\phi(x(u, v), y(u, v))$. The
additional term J(u, v) is called the Jacobian and is given by the determinant

$$J(u,v) = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{vmatrix}$$

The Jacobian is an important enough object that it also gets its own notation and is
sometimes written as

$$J = \frac{\partial(x,y)}{\partial(u,v)}$$


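Proof(ish): Here is a sketch of the proof to give you some intuition for why this is the
right thing to do. We evaluate the integral by summing over small areas $\delta A$, formed
by lines of constant u and v as shown by the red shaded region in Figure 4. The sides
of this small region have length $\delta u$ and $\delta v$ respectively, but what is its area? It's not
simply $\delta u\,\delta v$ because the sides aren't necessarily at right angles. Instead, the small
shaded region is approximately a parallelogram.

We think of the original coordinates as functions of the new, so x = x(u,v) and
y = y(u, v). If we vary u and v slightly, then the change in the original x and y
coordinates is

$$\delta x = \frac{\partial x}{\partial u}\,\delta u + \frac{\partial x}{\partial v}\,\delta v + \ldots \quad\text{and}\quad \delta y = \frac{\partial y}{\partial u}\,\delta u + \frac{\partial y}{\partial v}\,\delta v + \ldots$$

where the $+\ldots$ hide second order terms $O(\delta u^2)$, $O(\delta v^2)$ and $O(\delta u\,\delta v)$. This means
that we have

$$\begin{pmatrix} \delta x \\ \delta y \end{pmatrix} = \begin{pmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{pmatrix} \begin{pmatrix} \delta u \\ \delta v \end{pmatrix}$$

The small parallelogram is then spanned by the two vectors $\mathbf{a} = \left(\frac{\partial x}{\partial u}, \frac{\partial y}{\partial u}\right)\delta u$ and
$\mathbf{b} = \left(\frac{\partial x}{\partial v}, \frac{\partial y}{\partial v}\right)\delta v$. Recall that the area of a parallelogram is $|\mathbf{a}\times\mathbf{b}|$, so we have

$$\delta A = \left|\frac{\partial x}{\partial u}\frac{\partial y}{\partial v} - \frac{\partial x}{\partial v}\frac{\partial y}{\partial u}\right| \delta u\,\delta v = \left|\frac{\partial(x,y)}{\partial(u,v)}\right| \delta u\,\delta v$$

which is the promised result.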


An Example: 2d Polar Coordinates 


There is one particular choice of coordinates that vies with Cartesian coordinates in 
their usefulness. This is plane polar coordinates, defined by 


$$x = \rho\cos\phi \quad\text{and}\quad y = \rho\sin\phi$$

where the radial coordinate $\rho \ge 0$ and the angular coordinate takes values in $\phi \in [0, 2\pi)$.
(Note: we used $\phi(x,y)$ to describe a general scalar field earlier in this section. This
shouldn't be confused with the coordinate $\phi$ that we've introduced here.) We can easily
compute the Jacobian to find

$$J = \frac{\partial(x,y)}{\partial(\rho,\phi)} = \begin{vmatrix} \cos\phi & -\rho\sin\phi \\ \sin\phi & \rho\cos\phi \end{vmatrix} = \rho$$

So we learn that the area element is given by

$$dA = \rho\,d\rho\,d\phi$$

There is also a simple graphical explanation of this result: it follows by looking at the
area of the rounded square shape in the figure to the right (again, ignoring terms second
order in $\delta\phi$ and $\delta\rho$).


Let's now use this to do an integral. Let D be the wedge in the (x,y) plane defined by
$x \ge 0$, $y \ge 0$ and $x^2 + y^2 \le R^2$. This is shown to the left. In polar coordinates, this
region is given by

$$0 \le \rho \le R \quad\text{and}\quad 0 \le \phi \le \frac{\pi}{2}$$

We'll integrate the function $f = e^{-(x^2+y^2)/2} = e^{-\rho^2/2}$ over the region D. In polar
coordinates, we have

$$\int_D f\,dA = \int_0^{\pi/2} d\phi \int_0^R d\rho\; \rho\, e^{-\rho^2/2}$$

where the extra power of $\rho$ in the integrand comes from the Jacobian. The $\int d\phi$ integral
just gives us $\pi/2$, while the $\int d\rho$ integral is easily done. We have

$$\int_D f\,dA = \frac{\pi}{2}\Big[-e^{-\rho^2/2}\Big]_0^R = \frac{\pi}{2}\left(1 - e^{-R^2/2}\right)$$


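As a final application, consider taking the limit $R \to \infty$, so that we're integrating
over the quadrant $x, y \ge 0$. Clearly the answer is $\int_D f\,dA = \pi/2$. Back in Cartesian
coordinates, this calculation becomes

$$\int_D f\,dA = \int_0^\infty dx \int_0^\infty dy\; e^{-(x^2+y^2)/2} = \left(\int_0^\infty dx\; e^{-x^2/2}\right)\left(\int_0^\infty dy\; e^{-y^2/2}\right)$$

Comparing to our previous result, we find the well-known expression for a Gaussian
integral

$$\int_0^\infty dx\; e^{-x^2/2} = \sqrt{\frac{\pi}{2}}$$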
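A quick numerical check of both the polar calculation and the Gaussian integral is sketched below. The value of `R`, the cutoff of 12 used to stand in for infinity, and the helper `midpoint` are illustrative assumptions.

```python
import math

def midpoint(f, lo, hi, n=2000):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

R = 3.0

# Polar coordinates: the angular integral gives pi/2 and the radial
# integrand carries the extra factor of rho from the Jacobian.
polar = (math.pi / 2) * midpoint(lambda rho: rho * math.exp(-rho * rho / 2), 0.0, R)
closed_form = (math.pi / 2) * (1 - math.exp(-R * R / 2))

# One-dimensional Gaussian on [0, 12]; the tail beyond 12 is negligible.
gauss_1d = midpoint(lambda x: math.exp(-x * x / 2), 0.0, 12.0, n=20000)
```

Here `polar` should agree with `closed_form`, and the square of `gauss_1d` should be close to $\pi/2$, which is the factorisation argument used above.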
Figure 5. Two different ways to do a volume integral. On the left: perform the $\int dz$ integral
first; on the right, perform the $\int_{D(z)} dA$ area integral first.


2.1.3 Volume Integrals 


Most of this chapter will be devoted to discussing surfaces, but this is as good a place 
as any to introduce volume integrals because they are a straightforward generalisation 
of area integrals. 


The basic idea should by now be familiar. The integration of a scalar function 
$\phi: \mathbb{R}^3 \to \mathbb{R}$ over a three-dimensional region V can be approximated by dividing the
region into many small 3d pieces, each with volume $\delta V$ and located at some position
$\mathbf{x}_n$. You then find a way to take the limit

$$\int_V \phi(\mathbf{x})\,dV = \lim_{\delta V\to 0} \sum_n \phi(\mathbf{x}_n)\,\delta V$$

In practice, we evaluate volume integrals in the same way as we evaluate area integrals:
by performing successive integrations. If we use Cartesian coordinates (x,y,z) we have
a number of ways to proceed. For example, we could choose to first do the $\int dz$ integral,
subsequently leaving us with an area integral over the (x,y) plane:

$$\int_V \phi(x,y,z)\,dV = \int dA \int_{z_1(x,y)}^{z_2(x,y)} dz\; \phi(x,y,z)$$

This approach is shown on the left-hand side of Figure 5. Alternatively, we could first
do an area integral over some sliver of the region V and subsequently integrate over all
slivers. This is illustrated on the right-hand side of Figure 5 and results in an integral
of the form

$$\int_V \phi(x,y,z)\,dV = \int dz \int_{D(z)} dx\,dy\; \phi(x,y,z)$$

Figure 6. Spherical polar coordinates on the left, and cylindrical polar coordinates on the
right.

As before, for suitably nice functions $\phi$ and regions V, the order of integration is
unimportant.

There are many reasons to do a volume integral. You might, for example, want to
know the volume of some object, in which case you just integrate the function $\phi = 1$.
Alternatively, it's common to integrate a density of something, which means stuff per 
unit volume. Integrating the density over the region V tells you the amount of stuff in 
V. Examples of stuff that we will meet in other courses include mass, electric charge 
and probability. 


2.1.4 Spherical Polar and Cylindrical Polar Coordinates 


If your region V is some blocky shape, then Cartesian coordinates are probably the 
right way forward. However, for many applications it is more convenient to use a 
different choice of coordinates. 


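Given an invertible, smooth transformation $(x,y,z) \to (u,v,w)$ then the volume
elements are mapped to

$$dV = dx\,dy\,dz = |J|\,du\,dv\,dw$$

with the Jacobian given by

$$J = \frac{\partial(x,y,z)}{\partial(u,v,w)} = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v & \partial x/\partial w \\ \partial y/\partial u & \partial y/\partial v & \partial y/\partial w \\ \partial z/\partial u & \partial z/\partial v & \partial z/\partial w \end{vmatrix}$$

The sketch of the proof is identical to the 2d case: the volume of the appropriate
parallelepiped is $\delta V = |J|\,\delta u\,\delta v\,\delta w$.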


Two sets of coordinates are particularly useful. The first is spherical polar coordi- 
nates, related to Cartesian coordinates by the map 


$$\begin{aligned} x &= r\sin\theta\cos\phi \\ y &= r\sin\theta\sin\phi \\ z &= r\cos\theta \end{aligned} \qquad (2.5)$$

The range of the coordinates is $r \in [0, \infty)$, $\theta \in [0,\pi]$ and $\phi \in [0, 2\pi)$. The Jacobian is

$$\frac{\partial(x,y,z)}{\partial(r,\theta,\phi)} = r^2\sin\theta \quad\Rightarrow\quad dV = r^2\sin\theta\,dr\,d\theta\,d\phi \qquad (2.6)$$

The second is cylindrical polar coordinates, which coincide with plane polar coordinates
in the (x, y) plane, leaving z untouched

$$\begin{aligned} x &= \rho\cos\phi \\ y &= \rho\sin\phi \\ z &= z \end{aligned} \qquad (2.7)$$

with $\rho \in [0, \infty)$ and $\phi \in [0, 2\pi)$ and, of course, $z \in (-\infty, +\infty)$. (Later in the course,
we will sometimes denote the radial coordinate in cylindrical polar coordinates as r
instead of $\rho$.) This time the Jacobian is

$$\frac{\partial(x,y,z)}{\partial(\rho,\phi,z)} = \rho \quad\Rightarrow\quad dV = \rho\,d\rho\,d\phi\,dz$$

We can do some dimensional analysis to check that these results make sense. In
spherical polars we have one coordinate, r, with dimensions of length and two dimensionless
angular coordinates. Correspondingly, the Jacobian has dimension length$^2$ to ensure
that dV has the dimension of volume. In cylindrical polars, we have two coordinates
with dimension of length, $\rho$ and z, and just a single angular coordinate. This is the
reason that the Jacobian now has dimension of length rather than length$^2$.
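The two Jacobians can be checked numerically by building the matrix of partial derivatives with finite differences and taking its determinant. The helper `jacobian_det` and the sample points are illustrative choices; the derivatives are central-difference estimates, not exact.

```python
import math

def spherical(r, theta, phi):
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def cylindrical(rho, phi, z):
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def jacobian_det(f, u, h=1e-6):
    """Determinant of d(x, y, z)/d(u1, u2, u3), using central differences."""
    rows = []
    for i in range(3):
        up, um = list(u), list(u)
        up[i] += h
        um[i] -= h
        fp, fm = f(*up), f(*um)
        # rows[i] holds the derivatives of (x, y, z) with respect to u_i
        rows.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    a, b, c = rows
    # This is the determinant of the transposed matrix, which has the same value
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

r, theta, phi = 2.0, 0.7, 1.1
J_spherical = jacobian_det(spherical, (r, theta, phi))

rho, z = 1.5, -0.4
J_cylindrical = jacobian_det(cylindrical, (rho, phi, z))
```

At these sample points `J_spherical` should match $r^2\sin\theta$ and `J_cylindrical` should match $\rho$.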
Example 1: The Volume of a Sphere

Consider a spherically symmetric function f(r). We can integrate it over a ball of
radius R using spherical polar coordinates, with $dV = r^2\sin\theta\,dr\,d\theta\,d\phi$ to get

$$\int_V f\,dV = \int_0^R dr \int_0^\pi d\theta \int_0^{2\pi} d\phi\; r^2 f(r)\sin\theta = 2\pi\Big[-\cos\theta\Big]_0^\pi \int_0^R dr\; r^2 f(r) = 4\pi\int_0^R dr\; r^2 f(r)$$

In particular, if we take f(r) = 1 then we get the volume of a sphere $\text{Vol} = 4\pi R^3/3$.


Example 2: A Cylinder Cut Out of a Sphere 


Next consider a more convoluted example: we want the volume of a sphere of radius 
R, with a cylinder of radius s < R removed from the middle. The region V is then 
$x^2 + y^2 + z^2 \le R^2$, together with $x^2 + y^2 \ge s^2$. Note that we don't just subtract the
volume of a cylinder from that of a sphere because the top of the cylinder isn't flat:
it stops where it intersects the sphere.

In cylindrical coordinates, the region V spans $s \le \rho \le R$ and $-\sqrt{R^2-\rho^2} \le z \le \sqrt{R^2-\rho^2}$.
And, of course, $0 \le \phi < 2\pi$. We have $dV = \rho\,d\rho\,dz\,d\phi$ and

$$\text{Vol} = \int_V dV = \int_0^{2\pi} d\phi \int_s^R d\rho\; \rho \int_{-\sqrt{R^2-\rho^2}}^{\sqrt{R^2-\rho^2}} dz = 4\pi\int_s^R d\rho\; \rho\sqrt{R^2-\rho^2}$$

It is now straightforward to do the integral to find the volume

$$\text{Vol} = \frac{4\pi}{3}\left(R^2 - s^2\right)^{3/2}$$

Example 3: Electric Charge On a Hemisphere 


Consider a density of electric charge that increases linearly in the z-direction, with
$f(z) = f_0 z/R$, in a hemisphere H of radius R, with $z \ge 0$ and $f_0$ a constant. What is the
total charge in H?

In spherical polar coordinates, the coordinates for the hemisphere H are $0 \le r \le R$ and
$0 \le \phi < 2\pi$ and, finally, $0 \le \theta \le \pi/2$, which restricts us to the hemisphere with
$z \ge 0$. We integrate the function $f = f_0\, r\cos\theta/R$ over H with $dV = r^2\sin\theta\,dr\,d\theta\,d\phi$ to
find

$$\int_H f\,dV = \frac{f_0}{R}\int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta \int_0^R dr\; r^3\sin\theta\cos\theta = \frac{2\pi f_0}{R}\left[\frac{r^4}{4}\right]_0^R \left[\frac{1}{2}\sin^2\theta\right]_0^{\pi/2} = \frac{1}{4}\pi R^3 f_0$$

As a quick check on our answer, note that $f_0$ is the charge density so the dimensions
of the final answer are correct: the total charge is equal to the charge density times a
volume.
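The results of Examples 2 and 3 can be checked with one-dimensional numerical integrals, since in each case the angular integrals factor out. The numerical values of `R`, `s` and `f0` and the helper `midpoint` are arbitrary illustrative choices.

```python
import math

def midpoint(f, lo, hi, n=2000):
    """Midpoint-rule estimate of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

R, s, f0 = 2.0, 1.0, 1.5

# Example 2: the phi integral gives 2*pi and the z integral gives
# 2*sqrt(R^2 - rho^2), leaving a single integral over rho.
vol = 2 * math.pi * midpoint(
    lambda rho: rho * 2 * math.sqrt(R * R - rho * rho), s, R, n=200000)
vol_exact = 4 * math.pi / 3 * (R * R - s * s) ** 1.5

# Example 3: the integrand f0 * r^3 * sin(theta) * cos(theta) / R factorises.
charge = (f0 / R) * 2 * math.pi \
    * midpoint(lambda th: math.sin(th) * math.cos(th), 0.0, math.pi / 2) \
    * midpoint(lambda r: r ** 3, 0.0, R)
charge_exact = math.pi * R ** 3 * f0 / 4
```

A finer grid is used for Example 2 because the integrand has a square-root edge at $\rho = R$, where the midpoint rule converges more slowly.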


Vector Valued Integrals 


We can also integrate vector valued fields $\mathbf{F}: \mathbb{R}^3 \to \mathbb{R}^3$ over a volume V. There's
nothing subtle here: we just do the integral component by component and the final
answer is also a vector.

A common example arises when we compute the centre of mass. Let $\rho(\mathbf{x})$ be the
density of an object. (Note that this isn't a great choice of notation if we're working
in cylindrical polar coordinates.) The total mass is

$$M = \int_V \rho(\mathbf{x})\,dV$$

and the centre of mass is

$$\mathbf{X} = \frac{1}{M}\int_V \mathbf{x}\,\rho(\mathbf{x})\,dV$$

For example, consider again the solid hemisphere H from the previous example, covering
$0 \le r \le R$ and $z \ge 0$. We'll take this object to have constant density $\rho$. The total
mass is

$$M = \int_H \rho\,dV = \frac{2\pi}{3}\rho R^3$$

Writing $\mathbf{X} = (X, Y, Z)$ for the centre of mass, we need to compute the three components
individually. We have

$$X = \frac{\rho}{M}\int_H x\,dV = \frac{\rho}{M}\int_0^{2\pi} d\phi \int_0^R dr \int_0^{\pi/2} d\theta\; x\, r^2\sin\theta = \frac{\rho}{M}\int_0^{2\pi} d\phi \int_0^R dr \int_0^{\pi/2} d\theta\; r^3\sin^2\theta\cos\phi = 0$$

where the integral $\int d\phi\,\cos\phi = 0$. A similar calculation shows that Y = 0. Indeed, the
fact that the centre of mass lies at (X, Y) = (0,0) follows on symmetry grounds. We're
left with computing only the centre of mass in the z-direction. This is

$$Z = \frac{\rho}{M}\int_H z\,dV = \frac{\rho}{M}\int_0^{2\pi} d\phi \int_0^R dr \int_0^{\pi/2} d\theta\; r^3\cos\theta\sin\theta = \frac{3R}{8}$$

We learn that the centre of mass sits at $\mathbf{X} = (0, 0, 3R/8)$.
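The component-by-component nature of a vector valued integral is easy to see in a direct numerical sum. This sketch discretises the hemisphere in spherical polar coordinates with a midpoint grid; the grid size `n` and the values of `R` and `rho` are arbitrary illustrative choices.

```python
import math

R, rho = 1.0, 1.0     # radius and constant density, chosen for illustration
n = 40                # grid points per coordinate
dr, dth, dph = R / n, (math.pi / 2) / n, (2 * math.pi) / n

M = 0.0
moment = [0.0, 0.0, 0.0]   # components of the integral of rho * x over H

for i in range(n):
    r = (i + 0.5) * dr
    for j in range(n):
        th = (j + 0.5) * dth
        dV = r * r * math.sin(th) * dr * dth * dph   # volume element with Jacobian
        for k in range(n):
            ph = (k + 0.5) * dph
            pos = (r * math.sin(th) * math.cos(ph),
                   r * math.sin(th) * math.sin(ph),
                   r * math.cos(th))
            M += rho * dV
            for a in range(3):
                moment[a] += rho * pos[a] * dV

X = [c / M for c in moment]   # centre of mass
```

With these settings `M` should be close to $2\pi\rho R^3/3$ and `X` close to $(0, 0, 3R/8)$.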


Generalisation to R” 


Finally, it is straightforward to generalise multiple integrals to $\mathbb{R}^n$. If we make a smooth,
invertible change of coordinates from Cartesian $x^1,\ldots,x^n$ to some other coordinates
$u^1,\ldots,u^n$ then the integral over some n-dimensional region M is

$$\int_M f(x^i)\,dx^1\ldots dx^n = \int_{M'} f(x(u^i))\,|J|\,du^1\ldots du^n$$

where the Jacobian

$$J = \frac{\partial(x^1,\ldots,x^n)}{\partial(u^1,\ldots,u^n)}$$

is the obvious generalisation of our previous results.


2.2 Surface Integrals 


Our next task is to understand how to integrate over a surface that doesn’t lie flat in 
$\mathbb{R}^2$, but is instead curved in some way in $\mathbb{R}^3$. We will start by looking at how we define
such surfaces in the first place. 


2.2.1 Surfaces 
There are (at least) two different ways to describe a surface in $\mathbb{R}^3$.

• A surface can be viewed as the level set of a function,

$$F(x,y,z) = 0$$

This is one condition on three variables, so results in a two-dimensional surface
in $\mathbb{R}^3$. (In general, a single constraint like this results in an (n − 1)-dimensional
space in $\mathbb{R}^n$. Alternatively we say that the space has codimension one.)

• We can consider a parameterised surface, defined by the map

$$\mathbf{x}: \mathbb{R}^2 \to \mathbb{R}^3$$


This is the extension of the parameterised curve that we discussed in Section 1. 
This now defines a dimension two surface in any space R3. 




At each point on the surface, we can define a normal vector, $\mathbf{n}$, which points
perpendicularly away from the surface. When the surface is defined as the level set of
a function $F(\mathbf{x}) = 0$, the normal vector lies in the direction

$$\mathbf{n} \sim \nabla F$$

To see this, note that $\mathbf{m}\cdot\nabla F$ describes the rate of change of F in the direction $\mathbf{m}$.
If $\mathbf{m}$ lies tangent to the surface then we have, by definition, $\mathbf{m}\cdot\nabla F = 0$. Conversely,
the normal to the surface $\mathbf{n}$ lies in the direction in which the function F changes most
quickly, and this is $\nabla F$.

It's traditional to normalise the normal vector, so we usually define

$$\hat{\mathbf{n}} = \pm\frac{1}{|\nabla F|}\,\nabla F$$

where we'll say more about the choice of minus sign below.


Meanwhile, for the parameterised surface $\mathbf{x}(u, v) \in \mathbb{R}^3$, we can construct two tangent
vectors to the surface, namely

$$\frac{\partial\mathbf{x}}{\partial u} \quad\text{and}\quad \frac{\partial\mathbf{x}}{\partial v}$$

where each partial derivative is taken holding the other coordinate fixed. Each of these
lies within the surface, so the normal direction is

$$\mathbf{n} \sim \frac{\partial\mathbf{x}}{\partial u}\times\frac{\partial\mathbf{x}}{\partial v}$$

If $\mathbf{n} \neq 0$ everywhere on the surface then the parameterisation is said to be regular. Note
that, although a parameterised surface can be defined in any $\mathbb{R}^n$, the normal direction
is only unique in $\mathbb{R}^3$ where we have the cross product at our disposal.


Examples 


Here are a number of examples using the definition involving a level set. A sphere of
radius $R$ is defined by

$$ F(x, y, z) = x^2 + y^2 + z^2 - R^2 = 0 $$

The normal direction is given by $\nabla F = 2(x, y, z)$ and points radially outwards.

A hyperboloid is defined by

$$ F(x, y, z) = x^2 + y^2 - z^2 - R^2 = 0 $$

with normal direction given by $\nabla F = 2(x, y, -z)$. Note that for both the sphere and
hyperboloid, the normal vector is nowhere vanishing because the origin $\mathbf{x} = 0$ doesn't lie
on the surface. However, if we take the limit $R \to 0$ then the hyperboloid degenerates
to two cones, meeting at the origin. In this case, $\nabla F = 0$ at the origin, reflecting the
fact there is no unique direction away from the surface at this point. This is shown in
Figure 7.

Figure 7. When the hyperboloid degenerates to a cone, there is no well defined normal at
the origin.
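The statements about level-set normals can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `grad` and `unit_normal`, the radius and the sample points are all assumptions made for illustration, and the gradient is estimated by central differences rather than computed exactly.

```python
import math

def grad(F, p, h=1e-6):
    """Central-difference estimate of the gradient of F at the point p."""
    g = []
    for i in range(3):
        up = list(p); up[i] += h
        dn = list(p); dn[i] -= h
        g.append((F(*up) - F(*dn)) / (2 * h))
    return g

def unit_normal(F, p):
    """Unit normal to the level set F = 0 at p (taking the '+' sign)."""
    g = grad(F, p)
    norm = math.sqrt(sum(c * c for c in g))
    if norm < 1e-9:
        raise ValueError("grad F vanishes: no well-defined normal here")
    return [c / norm for c in g]

R = 2.0
sphere = lambda x, y, z: x**2 + y**2 + z**2 - R**2
hyperboloid = lambda x, y, z: x**2 + y**2 - z**2 - R**2
cone = lambda x, y, z: x**2 + y**2 - z**2   # the R -> 0 limit

p = (R / math.sqrt(3),) * 3           # a point on the sphere
n_sphere = unit_normal(sphere, p)     # expected to be radial, p / R
q = (R, 0.0, 0.0)                     # a point on the hyperboloid
n_hyper = unit_normal(hyperboloid, q)
```

Calling `unit_normal(cone, (0.0, 0.0, 0.0))` raises the `ValueError`, which mirrors the absence of a well defined normal at the tip of the cone.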


2.2.2 Surfaces with Boundaries 


A surface S can have a boundary. This is a piecewise smooth closed curve. If there are 
several boundaries, then this curve should be thought of as having several disconnected 
pieces. 


For example, we could define the surfaces above now restricted to the region $z \ge 0$.
In this case both the sphere and hyperboloid are truncated and their boundary is the
circle $x^2 + y^2 = R^2$ in the $z = 0$ plane.


The boundary of a surface $S$ is denoted $\partial S$, with $\partial$ the standard notation to denote
the boundary of any object. For example, later in the lectures we will denote the
boundary of a 3d volume $V$ as $\partial V$. You might reasonably wonder why we use the
partial derivative symbol $\partial$ to denote the boundary of something. There are some deep
and beautiful reasons behind this that will only become apparent in later courses. But
there is also a simple, intuitive reason. Consider a collection of 3d objects, all the same
shape but each bigger than the last. We'll denote these volumes as $V_r$. Then, roughly
speaking, you can view the boundary surface as

$$ \partial V_r = \lim_{\epsilon \to 0} \frac{1}{\epsilon} \left( V_{r+\epsilon} \setminus V_r \right) $$

where $\setminus$ means that you remove the 3d object $V_r$ from inside the slightly larger object
$V_{r+\epsilon}$. This, of course, looks very much like the formula for a derivative.

Figure 8. Two orientations of a sphere, with the unit normal pointing outwards or inwards.


This “derivative equals boundary” idea also shows up when we calculate volumes,
areas and lengths. For example, a disc of radius $r$ has area $\pi r^2$. The length of the
boundary is $\frac{d}{dr}(\pi r^2) = 2\pi r$. This relation continues to higher dimensional balls and
spheres.
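As a quick check of the claim about higher dimensional balls, here is the three-dimensional case. It assumes the standard formula for the volume of a solid ball of radius $r$, which is quoted here rather than derived in the text:

```latex
\frac{d}{dr}\left( \frac{4}{3}\pi r^3 \right) = 4\pi r^2
```

The right-hand side is the area of the sphere that bounds the ball.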


There is something important lurking in the idea of a boundary. The boundary is
necessarily a closed curve $C$, meaning that it has no end points. Another way of saying
this is that a closed curve $C$ itself has no boundary, or $\partial C = 0$. We see that if a curve
arises as the boundary of a surface, then the curve itself has no boundary. This is
captured in the slogan “the boundary of a boundary vanishes” or, in equation form,
$\partial^2 S = 0$. It is a general and powerful principle that extends to higher dimensional
objects where $\partial^2(\text{anything}) = 0$. The idea that the boundary of a boundary vanishes
is usually expressed simply as $\partial^2 = 0$.


A couple of quick definitions. A surface is said to be bounded if it doesn't stretch 
off to infinity. More precisely, a bounded surface can be contained within some solid 
sphere of fixed radius. A surface that does stretch off to infinity is said to be unbounded. 
Obviously, the sphere is a bounded surface, while the hyperboloid is unbounded. 


Finally, a bounded surface with no boundary is said to be closed.

Figure 9. Two unorientable surfaces: the Möbius strip on the left, and the Klein bottle on
the right.


2.2.3 Orientability 


As long as the normal vector $\mathbf{n} \neq 0$, we can always normalise it so that it has unit
length. But in general, there is no canonical way to fix the sign. This is a matter of
convention and determines what we mean by “outside the surface” and what we mean
by “inside”.


A surface is said to be orientable if there is a consistent choice of unit normal n which 
varies smoothly over the surface. The sphere and hyperboloid above are both orientable, 
with the two choices of an orientation for the sphere shown in Figure 8. Throughout 
these lectures we will work only with orientable surfaces. For such surfaces, a choice of 
sign fixes the unit normal everywhere and is said to determine the orientation of the 
surface. 


We note in passing that unorientable surfaces exist. You can easily make one of these 
in the comfort of your own home. Take a strip of paper and glue the two ends together. 
You've got two different ways to glue them as shown by the arrows below: 


If you glue by aligning the arrows shown on the left, then you're just left with a boring
strip of paper. But if you glue with the arrows aligned on the right, then you end up
with something new and exciting: an unorientable surface, known as a Möbius strip.
You can see one that I made earlier on the left of Figure 9. If you pick a normal vector
and evolve it smoothly around the strip then you'll find that it comes back pointing
in the other direction. Relatedly, the Möbius strip has a single boundary, rather than
two.




We can also construct closed, unorientable surfaces using a similar construction. This 
time we glue together two edges, like so 


Again, you have various choices. If you glue with the arrows aligned as shown on the 
left, then you'll end up with a torus which is very much orientable. If you glue with 
the arrows aligned as shown on the right then you get an unorientable surface called a 
Klein bottle. It's a little tricky to draw embedded in 3d space as it appears to intersect 
itself, but an attempt is shown on the right of Figure 9. 


2.2.4 Scalar Fields 
We're now in a position to start integrating objects over surfaces. For this, we work
with parameterised surfaces $\mathbf{x}(u, v)$.

Sit at some point $(u, v)$ on the surface, and move in both directions by some small
amount $\delta u$ and $\delta v$. This defines an approximate parallelogram on the surface, as shown
in the figure. The area of this parallelogram is

$$ \delta S = \left| \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} \right| \delta u \, \delta v $$

where, as usual, we've dropped higher order terms. This is called the scalar area. (We'll
see the need for the adjective “scalar” below when we introduce a variant known as the
vector area.)

Now we're in a position to define the surface integral of a scalar field $\phi(\mathbf{x})$. Given a
parameterised surface $S$, the surface integral is given by

$$ \int_S \phi(\mathbf{x}) \, dS = \int_D du \, dv \, \left| \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} \right| \phi(\mathbf{x}(u, v)) \tag{2.8} $$

where $D$ is the appropriate region in the $(u, v)$ plane. This is now the same kind of
area integral that we learned to do in Section 2.1.

The area integral of a scalar field does not depend on the orientation of the surface.
It doesn't matter what you choose as the inside of the surface $S$ and what you choose
as the outside, the integral of a scalar field over $S$ always gives the same answer. In
particular, if we integrate $\phi = 1$ over a surface $S$ then we get the area of that surface,
and this is always positive. This is entirely analogous to the line integral of a scalar
field that we met in Section 1.2 that was independent of the orientation of the curve.
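The definition (2.8) translates directly into a numerical recipe. The sketch below is one possible implementation under stated assumptions, not a method from the notes: the function name `surface_integral`, the midpoint rule, the finite-difference tangent vectors and the choice of a sphere of radius 1.5 as test surface are all illustrative.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def surface_integral(x, phi, u_range, v_range, n=200, h=1e-6):
    """Midpoint-rule estimate of (2.8) for a parameterised surface x(u, v)."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Tangent vectors dx/du and dx/dv by central differences.
            xu = [(a - b) / (2 * h) for a, b in zip(x(u + h, v), x(u - h, v))]
            xv = [(a - b) / (2 * h) for a, b in zip(x(u, v + h), x(u, v - h))]
            area = math.sqrt(sum(c * c for c in cross(xu, xv)))
            total += phi(*x(u, v)) * area * du * dv
    return total

R = 1.5

def sphere(theta, phi_angle):
    """Sphere of radius R in spherical polar angles."""
    return (R * math.sin(theta) * math.cos(phi_angle),
            R * math.sin(theta) * math.sin(phi_angle),
            R * math.cos(theta))

# Integrating phi = 1 should reproduce the area of the sphere.
area = surface_integral(sphere, lambda x, y, z: 1.0, (0, math.pi), (0, 2 * math.pi))
```

Because only the magnitude of the cross product enters, swapping the roles of `u` and `v` (which flips the normal) leaves the result unchanged, in line with the orientation independence described above.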


Reparameterisation Invariance 


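Importantly, the surface integral (2.8) is independent of the choice of parameterisation
of the surface. To see this, suppose that we replace our original parameterisation $\mathbf{x}(u, v)$
with an alternative parameterisation $\mathbf{x}(\tilde{u}, \tilde{v})$, both of which are assumed to be regular.
We then have

$$ \frac{\partial \mathbf{x}}{\partial u} = \frac{\partial \mathbf{x}}{\partial \tilde{u}} \frac{\partial \tilde{u}}{\partial u} + \frac{\partial \mathbf{x}}{\partial \tilde{v}} \frac{\partial \tilde{v}}{\partial u} \quad \text{and} \quad \frac{\partial \mathbf{x}}{\partial v} = \frac{\partial \mathbf{x}}{\partial \tilde{u}} \frac{\partial \tilde{u}}{\partial v} + \frac{\partial \mathbf{x}}{\partial \tilde{v}} \frac{\partial \tilde{v}}{\partial v} $$

Taking the cross-product, we have

$$ \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} = \frac{\partial(\tilde{u}, \tilde{v})}{\partial(u, v)} \, \frac{\partial \mathbf{x}}{\partial \tilde{u}} \times \frac{\partial \mathbf{x}}{\partial \tilde{v}} $$

This means that the scalar area element can equally well be written as

$$ dS = \left| \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} \right| du \, dv = \left| \frac{\partial \mathbf{x}}{\partial \tilde{u}} \times \frac{\partial \mathbf{x}}{\partial \tilde{v}} \right| d\tilde{u} \, d\tilde{v} $$

where we've used the result (2.4) which, in the current context, is
$d\tilde{u} \, d\tilde{v} = \left| \frac{\partial(\tilde{u}, \tilde{v})}{\partial(u, v)} \right| du \, dv$.

The essence of this calculation is the same as we saw for line integrals: the two
derivatives $\partial/\partial u$ and $\partial/\partial v$ in the integrand cancel the Jacobian factor under a change
of variables. The upshot is that we can write the surface integral (2.8) using any
parameterisation that we wish: the answer will be the same.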
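Reparameterisation invariance is easy to test numerically. The sketch below computes the area of one surface patch with two different parameterisations. Everything specific in it is an assumption made for illustration: the paraboloid patch, the particular change of variables (chosen so that it maps the unit square to itself with positive Jacobian) and the helper name `area`.

```python
import math

def area(x, n=150, h=1e-6):
    """Midpoint-rule area of x(u, v) over the unit square 0 <= u, v <= 1."""
    d = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            u, v = (i + 0.5) * d, (j + 0.5) * d
            a = [(p - q) / (2 * h) for p, q in zip(x(u + h, v), x(u - h, v))]
            b = [(p - q) / (2 * h) for p, q in zip(x(u, v + h), x(u, v - h))]
            c = (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])
            total += math.sqrt(c[0]**2 + c[1]**2 + c[2]**2) * d * d
    return total

def x_uv(u, v):
    """Original parameterisation: a patch of the paraboloid z = u^2 + v^2."""
    return (u, v, u * u + v * v)

def x_tilde(ut, vt):
    """The same patch in hypothetical new parameters (ut, vt)."""
    u = 0.5 * (ut + ut * ut)                     # du/dut = (1 + 2 ut) / 2 > 0
    v = (math.exp(vt) - 1.0) / (math.e - 1.0)    # dv/dvt > 0
    return x_uv(u, v)

area_uv = area(x_uv)
area_tilde = area(x_tilde)
```

The two estimates agree up to discretisation error, even though the integrands differ point by point: the Jacobian of the change of variables is absorbed by the tangent vectors, exactly as in the argument above.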


An Example 


Consider a sphere of radius $R$. Let $S$ be the subregion that sits at an angle $\theta \le \alpha$ from
the vertical. This is the grey region shown in the figure. We want to compute the area
of this cap.

We start by constructing a parameterisation of the sphere. This is straightforward if
we use the spherical polar angles $\theta$ and $\phi$ defined in (2.5) as parameters. We have

$$ \mathbf{x}(\theta, \phi) = R(\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta) := R\,\mathbf{e}_r $$

Here $\mathbf{e}_r$ is the unit vector that points radially outwards. (We will also use the
notation $\mathbf{e}_r = \hat{\mathbf{r}}$ later in these lectures.) We can then easily calculate

$$ \frac{\partial \mathbf{x}}{\partial \theta} = R(\cos\theta\cos\phi, \cos\theta\sin\phi, -\sin\theta) := R\,\mathbf{e}_\theta $$

$$ \frac{\partial \mathbf{x}}{\partial \phi} = R(-\sin\theta\sin\phi, \sin\theta\cos\phi, 0) := R\sin\theta\,\mathbf{e}_\phi $$

Here, by construction, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$ are unit vectors pointing in the direction of increasing
$\theta$ and $\phi$ respectively. We'll have more to say about the triplet of vectors $\mathbf{e}_r$, $\mathbf{e}_\theta$ and $\mathbf{e}_\phi$
in Section 3.3. For now, we can compute

$$ \frac{\partial \mathbf{x}}{\partial \theta} \times \frac{\partial \mathbf{x}}{\partial \phi} = R^2 \sin\theta \, \mathbf{e}_r $$

From this, we have the scalar area element

$$ dS = R^2 \sin\theta \, d\theta \, d\phi \tag{2.9} $$

We've seen a result very similar to this before. The volume element in spherical polar
coordinates (2.6) is $dV = r^2 \sin\theta \, dr \, d\theta \, d\phi$. Our area element over a sphere simply comes
from setting $r = R$ and ignoring the $dr$ piece of the volume element.

It is now straightforward to compute the area. We have

$$ A = \int_0^{2\pi} d\phi \int_0^{\alpha} d\theta \, R^2 \sin\theta = 2\pi R^2 (1 - \cos\alpha) $$

Note that if we set $\alpha = \pi$ then we get the area of a full sphere: $A = 4\pi R^2$.


2.2.5 Vector Fields and Flux 


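Now we turn to vector fields. There is a particularly interesting and useful way to
integrate a vector field $\mathbf{F}(\mathbf{x})$ over a surface $S$ so that we end up with a number. We do
this by taking the inner product of the vector field with the normal to the surface, $\mathbf{n}$,
so that

$$ \int_S \mathbf{F}(\mathbf{x}) \cdot \mathbf{n} \, dS = \int_D du \, dv \, \left( \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} \right) \cdot \mathbf{F}(\mathbf{x}(u, v)) \tag{2.10} $$

This is called the flux of $\mathbf{F}$ through $S$.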


The definition of the flux is independent of our choice of parameterisation: the
argument is identical to the one we saw above for a scalar field.

It's convenient to introduce some new notation. The vector area element is defined
as

$$ d\mathbf{S} = \mathbf{n} \, dS = \left( \frac{\partial \mathbf{x}}{\partial u} \times \frac{\partial \mathbf{x}}{\partial v} \right) du \, dv $$

This has magnitude $dS$ and points in the normal direction $\mathbf{n}$.


The flux of a vector field depends on the orientation of the surface S. This can be 
seen in the presence of the normal vector in (2.10). In the parameterised surface x(u, v), 
the choice of orientation can be traced to the parameterisation (u, v) and, in particular, 
the order in which they appear in the cross product. Changing the orientation of the 
surface flips the sign of the flux. 


The physical importance of the flux can be seen by thinking about a fluid. Let $\mathbf{F}(\mathbf{x})$
be the velocity field of a fluid. (Usually we would denote this as $\mathbf{u}(\mathbf{x})$ or $\mathbf{v}(\mathbf{x})$, but we've
already used $u$ and $v$ as the parameters of the surface so we'll adopt the non-standard
name $\mathbf{F}$ for the velocity to avoid confusion.) In a small time $\delta t$, the amount of fluid
flowing through a small surface element $\delta S$ is given by

$$ \text{Fluid Flow} = \mathbf{F} \, \delta t \cdot \mathbf{n} \, \delta S $$

where the dot product ensures that we don't include the component of fluid that flows
parallel to the surface. Integrating over the whole surface, we see that the flux of fluid

$$ \text{Flux} = \int_S \mathbf{F} \cdot d\mathbf{S} $$

is the amount of fluid crossing $S$ per unit time. In other words, the flux is the rate of
fluid flow.

We also talk of “flux” in other contexts, where there's no underlying flow. For
example, in our course on Electromagnetism, we will spend some time computing the
flux of the electric field through various surfaces, $\int_S \mathbf{E} \cdot d\mathbf{S}$.


An Example 
Consider the vector field

$$ \mathbf{F} = (-x, 0, z) $$

This is plotted in the $y = \text{constant}$ plane in the figure.

We want to integrate this vector field over the hemispherical cap, subtended by the
angle $\alpha$ that we used as an example in Section 2.2.4. This is the region of a sphere of
radius $R$, spanned by the polar coordinates

$$ 0 \le \theta \le \alpha \quad \text{and} \quad 0 \le \phi < 2\pi $$

We know from our previous work that

$$ d\mathbf{S} = R^2 \sin\theta \, \mathbf{e}_r \, d\theta \, d\phi \quad \text{with} \quad \mathbf{e}_r = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta) $$

In particular, we have

$$ \mathbf{F} \cdot \mathbf{e}_r = -x \sin\theta\cos\phi + z\cos\theta = R(-\sin^2\theta\cos^2\phi + \cos^2\theta) $$

The flux through the hemispherical cap is then

$$ \int_S \mathbf{F} \cdot d\mathbf{S} = \int_0^{\alpha} d\theta \int_0^{2\pi} d\phi \, R^3 \sin\theta \left[ -\sin^2\theta\cos^2\phi + \cos^2\theta \right] $$

We use $\int_0^{2\pi} d\phi \, \cos^2\phi = \pi$ to get

$$ \int_S \mathbf{F} \cdot d\mathbf{S} = \pi R^3 \int_0^{\alpha} d\theta \, \sin\theta \left[ -\sin^2\theta + 2\cos^2\theta \right] = \pi R^3 \Big[ \cos\theta \sin^2\theta \Big]_0^{\alpha} = \pi R^3 \cos\alpha \sin^2\alpha \tag{2.11} $$
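The closed form (2.11) can be compared against a direct numerical evaluation of the flux integral. This is a hedged sketch: the function names `flux_through_cap` and `flux_exact`, the midpoint rule and the grid size are choices made here for illustration and are not taken from the notes.

```python
import math

def flux_through_cap(R, alpha, n=200):
    """Midpoint-rule flux of F = (-x, 0, z) through the cap 0 <= theta <= alpha
    of a sphere of radius R, using dS = R^2 sin(theta) e_r dtheta dphi."""
    dth, dph = alpha / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * dth
        for j in range(n):
            ph = (j + 0.5) * dph
            e_r = (math.sin(th) * math.cos(ph),
                   math.sin(th) * math.sin(ph),
                   math.cos(th))
            x, y, z = (R * c for c in e_r)       # point on the sphere
            F = (-x, 0.0, z)
            F_dot_er = sum(f * e for f, e in zip(F, e_r))
            total += F_dot_er * R**2 * math.sin(th) * dth * dph
    return total

def flux_exact(R, alpha):
    """The closed form (2.11)."""
    return math.pi * R**3 * math.cos(alpha) * math.sin(alpha)**2
```

For instance, `flux_through_cap(2.0, math.pi / 2)` comes out close to zero, as (2.11) predicts for the upper hemisphere: the inflow from the $-x$ component balances the outflow from the $z$ component.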


2.2.6 A Sniff of the Gauss-Bonnet Theorem 


The methods described in this section have many interesting applications to geometry. 
Here we sketch two important ideas. We prove neither. 


Consider a surface $S$ and pick a point with normal $\mathbf{n}$. We can construct a plane
containing $\mathbf{n}$, as shown in the figure. The intersection of the original surface and the
plane describes a curve $C$ that lies in $S$. Associated to this curve is a curvature $\kappa$,
defined in (1.6).

Now, we rotate the plane about $\mathbf{n}$. As we do so, the curve $C$ changes and so too
does the curvature. Of particular interest are the maximum and minimum curvatures

$$ \kappa_{\min} \le \kappa \le \kappa_{\max} $$

These are referred to as principal curvatures. The Gaussian curvature of the surface $S$
at our chosen point is then defined to be

$$ K = \kappa_{\min} \kappa_{\max} $$

As defined, the curvature $K$ would appear to have as much to do with the embedding
of the surface in $\mathbb{R}^3$ as the surface itself. The Theorema Egregium (or “remarkable”
theorem), due to Gauss, is the statement that this is misleading: the curvature $K$ is a
property of the surface alone, irrespective of any choice of embedding. We say that $K$
is intrinsic to the surface.


The idea that curved surfaces have a life of their own, independent of their em- 
bedding, is an important one. It generalises to higher dimensional spaces, known as 
manifolds, which are the subject of differential geometry. In physics, curved space 
(or, more precisely, curved spacetime) provides the framework for our understanding 
of gravity. Both Riemannian geometry and its application to gravity will be covered in 
lectures on General Relativity. 


The Gaussian curvature $K$ has a number of interesting properties. Here's one. Consider
a geodesic triangle drawn on the surface as shown in Figure 10. This means that
we connect three points with geodesics, which are lines of the shortest distance as
measured using the arc length (1.5). Let $\theta_1$, $\theta_2$ and $\theta_3$ be the interior angles of the triangle,
defined by the inner product of tangent vectors of the geodesic curves. Then it turns
out that

$$ \theta_1 + \theta_2 + \theta_3 = \pi + \int_D K \, dS \tag{2.12} $$

where $D$ is the interior region of the triangle. If the triangle is drawn on flat $\mathbb{R}^2$, then
$K = 0$ and this theorem reduces to the well known statement that the angles of a
triangle add up to $\pi$.

We can check this formula for the simple case of a triangle drawn on a sphere. If
the sphere has radius $R$ then the geodesics are great circles and, as we saw in Section
1.1, they all have curvature $\kappa = 1/R$. Correspondingly, the Gaussian curvature for a
sphere is $K = 1/R^2$. A geodesic triangle is shown in the figure: it has two
right-angles $\pi/2$ sitting at the equator, and an angle $\alpha$ at the top.

Figure 10. A geodesic triangle inscribed on a surface.

The area of the region inside the triangle is $A = \alpha R^2$ (so that $A = 2\pi R^2$ when $\alpha = 2\pi$,
which is the area of the upper hemisphere). We then have

$$ \int_D K \, dS = \frac{A}{R^2} = \alpha $$

while the interior angles sum to $\pi/2 + \pi/2 + \alpha = \pi + \alpha$, which agrees with the result (2.12).


Here's another beautiful application of the Gaussian curvature. Consider a closed
surface $S$. Any such surface can be characterised by the number of holes that it has.
This number of holes is known as the genus. Three examples are given in Figure 11:
a sphere with $g = 0$, a torus with $g = 1$ and some kind of baked-good with genus $g = 3$.
It turns out that if you integrate the Gaussian curvature over the entire surface then
you get

$$ \int_S K \, dS = 4\pi(1 - g) \tag{2.13} $$


This result is all kinds of wonderful. The genus g tells us about the topology of the 
surface. It's a number that only makes sense when you stand back and look at the 
object as a whole. In contrast, the Gaussian curvature is a locally defined object: at 
any given point it depends only on the neighbourhood of that point. But this result 
tells us that integrating something local can result in something global. 


The round sphere provides a particularly simple example of this result. As we've
seen above, the Gaussian curvature is $K = 1/R^2$ which, when integrated over the
whole sphere, does indeed give $4\pi$ as befits a surface of genus $g = 0$. However, this
simple calculation hides the magic of the formula (2.13). Suppose that we start to
deform the sphere. We might choose to pull it out in some places, push it inwards in
others. We could try to mould some likeness of our face in some part of it. Everything
that we do changes the local Gaussian curvature. It will increase in some parts and
decrease in others. But the formula (2.13) tells us that this must, at the end of the
day, cancel out. As long as we don't tear the surface, so its topology remains that of a
sphere, the integral of $K$ will always give $4\pi$.

Figure 11. Three closed surfaces with different topologies. The sphere has genus $g = 0$, the
torus has genus $g = 1$ and the surface on the right has $g = 3$.
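The torus offers a second check of (2.13), this time with $g = 1$. The sketch below assumes the standard parameterisation of a torus with tube radius $a$ and central radius $b > a$, and quotes its Gaussian curvature without derivation. Neither the parameterisation nor the curvature formula appears in the text above, so both should be read as assumptions.

```latex
\[
\mathbf{x}(\theta, \phi) = \big( (b + a\cos\theta)\cos\phi,\ (b + a\cos\theta)\sin\phi,\ a\sin\theta \big),
\qquad 0 \le \theta, \phi < 2\pi
\]
\[
dS = a\,(b + a\cos\theta)\, d\theta\, d\phi ,
\qquad
K = \frac{\cos\theta}{a\,(b + a\cos\theta)}
\]
\[
\int_S K \, dS = \int_0^{2\pi} d\phi \int_0^{2\pi} d\theta\, \cos\theta = 0 = 4\pi(1 - 1)
\]
```

The curvature is positive on the outer half of the torus and negative on the inner half, and the two contributions cancel exactly.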


The results (2.12) and (2.13) are two sides of the wondrous Gauss-Bonnet theorem. 
A proof of this theorem will have to wait for later courses. (You can find a somewhat 
unconventional proof using methods from physics in the lectures on Supersymmetric 
Quantum Mechanics. This proof also works for a more powerful generalisation to higher 
dimensional spaces, known as the Chern-Gauss-Bonnet theorem.)


3 Grad, Div and Curl


In this section we're going to further develop the ways in which we can differentiate. 
We'll be particularly interested in how we can differentiate scalar and vector fields. Our 
definitions will be straightforward but, at least for the time being, we won't be able to 
offer the full intuition behind these ideas. Perhaps ironically, the full meaning of how to 
differentiate will become clear only in Section 4 where we also learn the corresponding 
different ways to integrate. 


3.1 The Gradient 


We've already seen how to differentiate a scalar field $\phi : \mathbb{R}^n \to \mathbb{R}$. Given Cartesian
coordinates $x^i$ with $i = 1, \ldots, n$ on $\mathbb{R}^n$, the gradient of $\phi$ is defined as

$$ \nabla\phi = \frac{\partial \phi}{\partial x^i} \mathbf{e}_i \tag{3.1} $$

Note that differentiating a scalar field leaves us with a vector field.

The definition above relies on a choice of Cartesian coordinates. Later in this section,
we'll find expressions for the gradient in different coordinate systems. But there is also
a definition of the gradient that does not rely on any coordinate choice at all. This
starts by considering a point $\mathbf{x} \in \mathbb{R}^n$. We don't, yet, think of $\mathbf{x}$ as defined by a string of
$n$ numbers: that comes only with a choice of coordinates. Instead, it should be viewed
as an abstract point in $\mathbb{R}^n$.

The first principles, coordinate-free definition of the gradient $\nabla\phi$ simply compares
the value of $\phi$ at some point $\mathbf{x}$ to the value at some neighbouring point $\mathbf{x} + \mathbf{h}$ with
$h = |\mathbf{h}| \ll 1$. For a differentiable function $\phi$, we can write

$$ \phi(\mathbf{x} + \mathbf{h}) = \phi(\mathbf{x}) + \mathbf{h} \cdot \nabla\phi + \mathcal{O}(h^2) \tag{3.2} $$

where this should be thought of as the definition of the gradient $\nabla\phi$. Note that it's
similar in spirit to our definition of the tangent to a curve $\dot{\mathbf{x}}$ given in (1.2). If we pick
a choice of coordinates, with $\mathbf{x} = (x^1, \ldots, x^n)$, then we can take $\mathbf{h} = \epsilon\,\mathbf{e}_i$ with $\epsilon \ll 1$.
The definition (3.2) then coincides with (3.1).
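The content of (3.2) is that the error made by the linear approximation shrinks quadratically with $h$. A small numerical experiment makes this visible. The scalar field, the base point and the direction below are arbitrary choices made for this sketch, not taken from the notes.

```python
import math

def phi(x, y, z):
    # Arbitrary smooth test field (an assumption made for illustration).
    return x * x * y + math.sin(z)

def grad_phi(x, y, z):
    # Gradient of the test field, computed by hand from (3.1).
    return (2 * x * y, x * x, math.cos(z))

def remainder(p, h):
    """phi(p + h) - phi(p) - h . grad(phi)(p), which (3.2) says is O(|h|^2)."""
    shifted = [a + b for a, b in zip(p, h)]
    linear = sum(a * b for a, b in zip(h, grad_phi(*p)))
    return phi(*shifted) - phi(*p) - linear

p = (1.0, 2.0, 0.5)
direction = (0.6, 0.0, 0.8)   # a unit vector

for eps in (1e-1, 1e-2, 1e-3):
    h = [eps * d for d in direction]
    # The ratio remainder / eps^2 should settle down to a constant.
    print(eps, remainder(p, h) / eps**2)
```

If the ratio instead grew like $1/\epsilon$ as $\epsilon$ decreased, that would signal that `grad_phi` is not the gradient of `phi`, which makes this a handy test for hand-computed gradients.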


An Example 


Consider the function on $\mathbb{R}^3$,

$$ \phi(x, y, z) = \frac{1}{\sqrt{x^2 + y^2 + z^2}} = \frac{1}{r} $$

where $r^2 = x^2 + y^2 + z^2$, so that $r$ is the distance from the origin. We have

$$ \frac{\partial \phi}{\partial x} = -\frac{x}{(x^2 + y^2 + z^2)^{3/2}} = -\frac{x}{r^3} $$

and similar for the others. The gradient is then given by

$$ \nabla\phi = -\frac{x\hat{\mathbf{x}} + y\hat{\mathbf{y}} + z\hat{\mathbf{z}}}{r^3} = -\frac{\hat{\mathbf{r}}}{r^2} $$

where, in the final expression, we've introduced the unit vector $\hat{\mathbf{r}}$ which points
radially outwards in each direction, like the spikes on a hedgehog as shown in the figure.
The vector field $\nabla\phi$ points radially, decreasing as $1/r^2$. Vector fields of this kind are
important in electromagnetism where they describe the electric field $\mathbf{E}(\mathbf{x})$ arising from
a charged particle.


An Application: Following a Curve 


Suppose that we're given a curve in $\mathbb{R}^n$, defined by the map $\mathbf{x} : \mathbb{R} \to \mathbb{R}^n$, together
with a scalar field $\phi : \mathbb{R}^n \to \mathbb{R}$. Then we can combine these into the composite map
$\phi(\mathbf{x}(t)) : \mathbb{R} \to \mathbb{R}$. This is simply the value of the scalar field evaluated on the curve. We
can then differentiate this map along the curve using the higher dimensional version of
the chain rule,

$$ \frac{d\phi(\mathbf{x}(t))}{dt} = \frac{\partial \phi}{\partial x^i} \frac{dx^i}{dt} $$

This has a nice, compact expression in terms of the gradient,

$$ \frac{d\phi(\mathbf{x}(t))}{dt} = \nabla\phi \cdot \frac{d\mathbf{x}}{dt} $$

This tells us how the function $\phi(\mathbf{x})$ changes as we move along the curve.
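The chain rule along a curve can be checked on the field $\phi = 1/r$ from the example above. The helix used as the curve is an arbitrary choice made for this sketch, and the left-hand side is estimated by a central difference rather than computed exactly.

```python
import math

def phi(x, y, z):
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def grad_phi(x, y, z):
    # grad(1/r) = -x / r^3, as computed in the example above.
    r3 = (x * x + y * y + z * z) ** 1.5
    return (-x / r3, -y / r3, -z / r3)

def curve(t):
    # A helix, chosen only for illustration.
    return (math.cos(t), math.sin(t), 0.5 * t)

def curve_velocity(t):
    return (-math.sin(t), math.cos(t), 0.5)

def lhs(t, h=1e-5):
    """d/dt of phi(x(t)), estimated by a central difference."""
    return (phi(*curve(t + h)) - phi(*curve(t - h))) / (2 * h)

def rhs(t):
    """grad(phi) . dx/dt evaluated on the curve."""
    return sum(g * v for g, v in zip(grad_phi(*curve(t)), curve_velocity(t)))
```

On this helix $r^2 = 1 + t^2/4$, so both sides can also be compared with the hand-computed derivative $-\tfrac{t}{4}(1 + t^2/4)^{-3/2}$.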


3.2 Div and Curl 


At this stage we take an interesting and bold mathematical step. We view $\nabla$ as an
object in its own right. It is called the gradient operator.

$$ \nabla = \mathbf{e}_i \frac{\partial}{\partial x^i} $$

This is both a vector and an operator. The fact that $\nabla$ is an operator means that it's
just waiting for a function to come along (from the right) and be differentiated.

The gradient operator $\nabla$ sometimes goes by the names nabla or del, although usually
only when explaining to students in a first course on vector calculus that $\nabla$ sometimes
goes by the names nabla or del. (Admittedly, the LaTeX command for $\nabla$ is \nabla which
helps keep the name alive.)

With $\nabla$ divorced from the scalar field on which it originally acted, we can now think
creatively about how it may act on other fields. As we've seen, a vector field is defined
to be a map

$$ \mathbf{F} : \mathbb{R}^n \to \mathbb{R}^n $$

Given two vectors, we all have a natural urge to dot them together. This gives a
derivative acting on vector fields known as the divergence

$$ \nabla \cdot \mathbf{F} = \left( \mathbf{e}_i \frac{\partial}{\partial x^i} \right) \cdot (\mathbf{e}_j F_j) = \frac{\partial F_i}{\partial x^i} $$

where we've used the orthonormality $\mathbf{e}_i \cdot \mathbf{e}_j = \delta_{ij}$. Note that the gradient of a scalar
field gave a vector field. Now the divergence of a vector field gives a scalar field.


The divergence isn't the only way to differentiate a vector field. If we're in $\mathbb{R}^n$,
a vector field has $n$ components and we could differentiate each of these in one of $n$
different directions. This means that there are $n^2$ different meanings to the “derivative
of a vector field”. But the divergence turns out to be the combination that is most
useful.

Both the gradient and divergence operations can be applied to fields in $\mathbb{R}^n$. In
contrast, our final operation holds only for vector fields that map

$$ \mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3 $$

In this case, we can take the cross product. This gives a derivative of a vector field
known as the curl,

$$ \nabla \times \mathbf{F} = \left( \mathbf{e}_i \frac{\partial}{\partial x^i} \right) \times (\mathbf{e}_j F_j) = \epsilon_{ijk} \frac{\partial F_j}{\partial x^i} \mathbf{e}_k $$

Or, written out in its full glory,

$$ \nabla \times \mathbf{F} = \left( \frac{\partial F_3}{\partial x^2} - \frac{\partial F_2}{\partial x^3},\ \frac{\partial F_1}{\partial x^3} - \frac{\partial F_3}{\partial x^1},\ \frac{\partial F_2}{\partial x^1} - \frac{\partial F_1}{\partial x^2} \right) $$

The curl of a vector field is, again, a vector field. It can also be written as the
determinant

$$ \nabla \times \mathbf{F} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ \dfrac{\partial}{\partial x^1} & \dfrac{\partial}{\partial x^2} & \dfrac{\partial}{\partial x^3} \\ F_1 & F_2 & F_3 \end{vmatrix} $$

As we proceed through these lectures, we'll build intuition for the meaning of these
two derivatives. We will see, in particular, that the divergence $\nabla \cdot \mathbf{F}$ measures the net
flow of the vector field $\mathbf{F}$ into, or out of, any given point. Meanwhile, the curl $\nabla \times \mathbf{F}$
measures the rotation of the vector field. A full understanding of this will come only
in Section 4 when we learn to undo the differentiation through integration. For now
we will content ourselves with some simple examples.


Simple Examples 
Consider the vector field

$$ \mathbf{F}(\mathbf{x}) = (x^2, 0, 0) $$

Clearly this flows in a straight line, with increasing strength. It has $\nabla \cdot \mathbf{F} = 2x$, reflecting
the fact that the vector field gets stronger as $x$ increases. It also has $\nabla \times \mathbf{F} = 0$.

Next, consider the vector field

$$ \mathbf{F}(\mathbf{x}) = (y, -x, 0) $$

This swirls, as shown in the figure on the right. We have $\nabla \cdot \mathbf{F} = 0$ and
$\nabla \times \mathbf{F} = (0, 0, -2)$. The curl points in the $z$ direction, perpendicular to the plane
of the swirling.

Finally, we can consider the hedgehog-like radial vector field that we met previously,

$$ \mathbf{F} = \frac{\hat{\mathbf{r}}}{r^2} = \frac{1}{(x^2 + y^2 + z^2)^{3/2}} (x, y, z) \tag{3.5} $$

You can check that this obeys $\nabla \cdot \mathbf{F} = 0$ and $\nabla \times \mathbf{F} = 0$. Or, to be more precise, it obeys
these equations almost everywhere. Clearly something fishy is going on at the origin
$r = 0$. In fact, we will later see that we can make this less fishy: a correct statement is

$$ \nabla \cdot \mathbf{F} = 4\pi \delta^3(\mathbf{x}) $$

where $\delta^3(\mathbf{x})$ is the higher-dimensional version of the Dirac delta function. We'll
understand this result better in Section 5 where we will wield the Gauss divergence theorem.

When evaluating the derivatives of radial fields, like the hedgehog (3.5), it's best to
work with the radial distance $r$, given by $r^2 = x^i x^i$. Taking the derivative then gives
$2r \, \partial r/\partial x^i = 2x^i$ and we have $\partial r/\partial x^i = x^i/r$. You can then check that, for any integer
$p$,

$$ \nabla r^p = \mathbf{e}_i \frac{\partial (r^p)}{\partial x^i} = p \, r^{p-1} \hat{\mathbf{r}} $$

Meanwhile, the vector $\mathbf{x} = x^i \mathbf{e}_i$ can equally well be written as $\mathbf{x} = \mathbf{r} = r\hat{\mathbf{r}}$ which
highlights that it points outwards in the radial direction. We have

$$ \nabla \cdot \mathbf{r} = \frac{\partial x^i}{\partial x^i} = \delta_{ii} = n $$

where the $n$ arises because we're summing over all $i = 1, \ldots, n$. (Obviously, if we're
working in $\mathbb{R}^3$ then $n = 3$.) We can also take the curl

$$ \nabla \times \mathbf{r} = \epsilon_{ijk} \frac{\partial x^j}{\partial x^i} \mathbf{e}_k = \epsilon_{ijk} \delta_{ij} \mathbf{e}_k = 0 $$

which, as always for the curl, holds only in $\mathbb{R}^3$.
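The divergences and curls quoted in these examples can be confirmed with finite differences. The following is a minimal sketch: the helper names `partial`, `div` and `curl`, the step size and the sample point are assumptions made here, and the derivatives are numerical estimates rather than exact values.

```python
import math

def partial(F, p, i, h=1e-5):
    """Central-difference derivative of every component of F along x^i."""
    up = list(p); up[i] += h
    dn = list(p); dn[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(F(*up), F(*dn))]

def div(F, p):
    return sum(partial(F, p, i)[i] for i in range(3))

def curl(F, p):
    d = [partial(F, p, i) for i in range(3)]   # d[i][j] = dF_j / dx^i
    return (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])

straight = lambda x, y, z: (x * x, 0.0, 0.0)
swirl = lambda x, y, z: (y, -x, 0.0)
position = lambda x, y, z: (x, y, z)

def hedgehog(x, y, z):
    r3 = (x * x + y * y + z * z) ** 1.5
    return (x / r3, y / r3, z / r3)

p = (0.7, -0.4, 1.2)   # arbitrary point away from the origin
for name, F in [("straight", straight), ("swirl", swirl),
                ("hedgehog", hedgehog), ("position", position)]:
    print(name, div(F, p), curl(F, p))
```

The hedgehog field must be sampled away from the origin: a finite-difference estimate at a single point cannot see the delta function sitting at $r = 0$.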


3.2.1 Some Basic Properties 


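There are a number of straightforward properties obeyed by grad, div and curl. First,
each of these is a linear differential operator, meaning that

$$ \nabla(\alpha\phi + \psi) = \alpha\nabla\phi + \nabla\psi $$
$$ \nabla \cdot (\alpha\mathbf{F} + \mathbf{G}) = \alpha\nabla \cdot \mathbf{F} + \nabla \cdot \mathbf{G} $$
$$ \nabla \times (\alpha\mathbf{F} + \mathbf{G}) = \alpha\nabla \times \mathbf{F} + \nabla \times \mathbf{G} $$

for any scalar fields $\phi$ and $\psi$, vector fields $\mathbf{F}$ and $\mathbf{G}$, and any constant $\alpha$.

Next, each of them has a Leibniz property, which means that they obey a
generalisation of the product rule. These are

$$ \nabla(\phi\psi) = \phi\nabla\psi + \psi\nabla\phi $$
$$ \nabla \cdot (\phi\mathbf{F}) = (\nabla\phi) \cdot \mathbf{F} + \phi(\nabla \cdot \mathbf{F}) $$
$$ \nabla \times (\phi\mathbf{F}) = (\nabla\phi) \times \mathbf{F} + \phi(\nabla \times \mathbf{F}) $$

In the last of these, you need to be careful about the placing and ordering of $\nabla$, just
like you need to be careful about the ordering of any other vector when dealing with
the cross product. The proof of any of these is simply an exercise in plugging in the
component definition of the operator and using the product rule. For example, we can
prove the second equality thus:

$$ \nabla \cdot (\phi\mathbf{F}) = \frac{\partial(\phi F_i)}{\partial x^i} = \frac{\partial \phi}{\partial x^i} F_i + \phi \frac{\partial F_i}{\partial x^i} = (\nabla\phi) \cdot \mathbf{F} + \phi(\nabla \cdot \mathbf{F}) $$

There are also a handful of further Leibnizian properties involving two vector fields.
The first of these is straightforward to state:

$$ \nabla \cdot (\mathbf{F} \times \mathbf{G}) = (\nabla \times \mathbf{F}) \cdot \mathbf{G} - \mathbf{F} \cdot (\nabla \times \mathbf{G}) $$

This is simplest to prove using index notation. Alternatively, it follows from the usual
scalar triple product formula for three vectors. To state the other properties, we need
one further small abstraction. Given a vector field $\mathbf{F}$ and the gradient operator $\nabla$, we
can construct further differential operators. These are

$$ \mathbf{F} \cdot \nabla = F_i \frac{\partial}{\partial x^i} \quad \text{and} \quad \mathbf{F} \times \nabla = \mathbf{e}_k \epsilon_{ijk} F_i \frac{\partial}{\partial x^j} $$

Note that the vector field $\mathbf{F}$ sits on the left, so isn't acted upon by the partial derivative.
Instead, each of these objects is itself a differential operator, just waiting for something
to come along so that it can differentiate it. In particular, these constructions appear
in two further identities

$$ \nabla(\mathbf{F} \cdot \mathbf{G}) = \mathbf{F} \times (\nabla \times \mathbf{G}) + \mathbf{G} \times (\nabla \times \mathbf{F}) + (\mathbf{F} \cdot \nabla)\mathbf{G} + (\mathbf{G} \cdot \nabla)\mathbf{F} $$
$$ \nabla \times (\mathbf{F} \times \mathbf{G}) = (\nabla \cdot \mathbf{G})\mathbf{F} - (\nabla \cdot \mathbf{F})\mathbf{G} + (\mathbf{G} \cdot \nabla)\mathbf{F} - (\mathbf{F} \cdot \nabla)\mathbf{G} $$

Again, these are not difficult to prove: they follow from expanding out the left-hand
side in components.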
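Identities like these are easy to get wrong by a sign, so a numerical spot check is a useful complement to the component proofs. The sketch below tests two of them at a single point. The test fields `F`, `G` and `phi`, the sample point and the finite-difference helpers are all assumptions chosen for illustration.

```python
import math

def partial(F, p, i, h=1e-5):
    """Central-difference derivative of every component of F along x^i."""
    up = list(p); up[i] += h
    dn = list(p); dn[i] -= h
    return [(a - b) / (2 * h) for a, b in zip(F(*up), F(*dn))]

def div(F, p):
    return sum(partial(F, p, i)[i] for i in range(3))

def curl(F, p):
    d = [partial(F, p, i) for i in range(3)]   # d[i][j] = dF_j / dx^i
    return (d[1][2] - d[2][1], d[2][0] - d[0][2], d[0][1] - d[1][0])

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

# Arbitrary smooth test fields (assumptions, for illustration only).
F = lambda x, y, z: (y * z, math.sin(x), x * x * z)
G = lambda x, y, z: (math.cos(y), x * z, y + z * z)
phi = lambda x, y, z: math.exp(x) * y + z

p = (0.3, -1.1, 0.8)

# div(F x G) = (curl F) . G - F . (curl G)
lhs_cross = div(lambda x, y, z: cross(F(x, y, z), G(x, y, z)), p)
rhs_cross = dot(curl(F, p), G(*p)) - dot(F(*p), curl(G, p))

# div(phi F) = (grad phi) . F + phi (div F)
grad_phi = [partial(lambda x, y, z: (phi(x, y, z),), p, i)[0] for i in range(3)]
lhs_leibniz = div(lambda x, y, z: tuple(phi(x, y, z) * c for c in F(x, y, z)), p)
rhs_leibniz = dot(grad_phi, F(*p)) + phi(*p) * div(F, p)
```

Agreement at one point is of course not a proof, but a mismatch would immediately expose a wrong sign or a misplaced factor.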


3.2.2 Conservative is Irrotational


Recall that a conservative vector field $\mathbf{F}$ is one that can be written as

$$ \mathbf{F} = \nabla\phi $$

for some scalar field $\phi$. We also say that $\mathbf{F}$ is irrotational if $\nabla \times \mathbf{F} = 0$. There is a
beautiful theorem that says these two concepts are actually equivalent:

Theorem: (Poincaré lemma) For fields defined everywhere on $\mathbb{R}^3$, conservative is the
same as irrotational.

$$ \nabla \times \mathbf{F} = 0 \quad \Longleftrightarrow \quad \mathbf{F} = \nabla\phi $$

Half Proof: It is trivial to prove this in one direction. Suppose that $\mathbf{F} = \nabla\phi$, so that
$F_i = \partial_i \phi$. Then

$$ \nabla \times \mathbf{F} = \epsilon_{ijk} \, \partial_i F_j \, \mathbf{e}_k = \epsilon_{ijk} \, \partial_i \partial_j \phi \, \mathbf{e}_k = 0 $$

which vanishes because the $\epsilon_{ijk}$ symbol means that we're anti-symmetrising over $ij$,
but the partial derivatives $\partial_i \partial_j$ are symmetric, so the terms like $\partial_1 \partial_2 - \partial_2 \partial_1$ cancel.

It is less obvious that the converse statement holds, i.e. that irrotational implies
conservative. We'll show this only in Section 4.4 where it appears as a corollary of
Stokes' theorem.

Recall that in Section 1.3 we showed that the line integral of a conservative field was
independent of the path taken. Putting this together with the result above, we have
the following, equivalent statements:

$$ \nabla \times \mathbf{F} = 0 \quad \Longleftrightarrow \quad \mathbf{F} = \nabla\phi \quad \Longleftrightarrow \quad \oint_C \mathbf{F} \cdot d\mathbf{x} = 0 $$

where we've yet to see the proof of the first $\Longrightarrow$.


3.2.3 Solenoidal Fields 


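Here is another definition. A vector field $\mathbf{F}$ is called divergence free or solenoidal if
$\nabla \cdot \mathbf{F} = 0$. (The latter name comes from electromagnetism, where a magnetic field $\mathbf{B}$
is most easily generated by a tube with a bunch of wires wrapped around it known as
a “solenoid” and has the property $\nabla \cdot \mathbf{B} = 0$.)

There is a nice theorem about divergence free fields that is a counterpart to the one
above:

Theorem: Any divergence free field can be written as the curl of something else,

$$ \nabla \cdot \mathbf{F} = 0 \quad \Longleftrightarrow \quad \mathbf{F} = \nabla \times \mathbf{A} $$

again, provided that $\mathbf{F}$ is defined everywhere on $\mathbb{R}^3$. Note that $\mathbf{A}$ is not unique. In
particular, if you find one $\mathbf{A}$ that does the job then any other $\mathbf{A} + \nabla\phi$ will work equally
well.

Proof: It's again straightforward to show this one way. If $\mathbf{F} = \nabla \times \mathbf{A}$, then
$F_i = \epsilon_{ijk} \partial_j A_k$ and so

$$ \nabla \cdot \mathbf{F} = \partial_i (\epsilon_{ijk} \partial_j A_k) = 0 $$

which again vanishes for symmetry reasons.

This time, we will prove the converse statement by explicitly exhibiting a vector
potential $\mathbf{A}$ such that $\mathbf{F} = \nabla \times \mathbf{A}$. We pick some arbitrary point $\mathbf{x}_0 = (x_0, y_0, z_0)$ and
then construct the following vector field

$$ \mathbf{A}(\mathbf{x}) = \left( \int_{z_0}^{z} F_y(x, y, z') \, dz' ,\ \int_{x_0}^{x} F_z(x', y, z_0) \, dx' - \int_{z_0}^{z} F_x(x, y, z') \, dz' ,\ 0 \right) \tag{3.6} $$

Since $A_z = 0$, the definition of the curl (3.4) becomes

$$ \nabla \times \mathbf{A} = \left( -\frac{\partial A_y}{\partial z},\ \frac{\partial A_x}{\partial z},\ \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) $$

Using the ansatz (3.6), we find that the first two components of $\nabla \times \mathbf{A}$ immediately
give what we want

$$ (\nabla \times \mathbf{A})_x = F_x(x, y, z) \quad \text{and} \quad (\nabla \times \mathbf{A})_y = F_y(x, y, z) $$

both of which follow from the fundamental theorem of calculus. Meanwhile, we still
have a little work ahead of us for the final component

$$ (\nabla \times \mathbf{A})_z = F_z(x, y, z_0) - \int_{z_0}^{z} \frac{\partial F_x}{\partial x}(x, y, z') \, dz' - \int_{z_0}^{z} \frac{\partial F_y}{\partial y}(x, y, z') \, dz' $$

At this point we use the fact that $\mathbf{F}$ is solenoidal, so $\nabla \cdot \mathbf{F} = 0$ and so
$\partial F_z/\partial z = -(\partial F_x/\partial x + \partial F_y/\partial y)$. We then have

$$ (\nabla \times \mathbf{A})_z = F_z(x, y, z_0) + \int_{z_0}^{z} \frac{\partial F_z}{\partial z'}(x, y, z') \, dz' = F_z(x, y, z) $$

This is the result we want.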
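The construction (3.6) can be carried out numerically. The sketch below builds $\mathbf{A}$ for one divergence-free test field and checks that its curl reproduces $\mathbf{F}$. The test field, the base point values `x0` and `z0`, the midpoint quadrature and the finite-difference curl are all assumptions made for illustration. The field is chosen so that the integrands are at most linear, which makes the midpoint rule exact; a generic field would need a finer quadrature and a looser tolerance.

```python
def integrate(f, a, b, n=200):
    """Composite midpoint rule for the integral of f from a to b."""
    step = (b - a) / n
    return step * sum(f(a + (k + 0.5) * step) for k in range(n))

def F(x, y, z):
    # A divergence-free test field: div F = z + z - 2z = 0.
    return (x * z, y * z, -z * z)

x0, z0 = 0.2, -0.3   # the components of the base point x_0 that enter (3.6)

def A(x, y, z):
    """The vector potential (3.6), with the integrals done numerically."""
    Ax = integrate(lambda s: F(x, y, s)[1], z0, z)
    Ay = (integrate(lambda s: F(s, y, z0)[2], x0, x)
          - integrate(lambda s: F(x, y, s)[0], z0, z))
    return (Ax, Ay, 0.0)

def curl(V, p, h=1e-4):
    def d(i, j):
        """dV_j / dx^i by central differences."""
        up = list(p); up[i] += h
        dn = list(p); dn[i] -= h
        return (V(*up)[j] - V(*dn)[j]) / (2 * h)
    return (d(1, 2) - d(2, 1), d(2, 0) - d(0, 2), d(0, 1) - d(1, 0))

p = (0.9, 0.4, 1.1)
curl_A = curl(A, p)   # should reproduce F(*p)
```

Changing `x0` or `z0` changes `A` but not its curl, which illustrates the non-uniqueness of the vector potential noted in the theorem.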


Note that both theorems above come with a caveat: the fields must be defined
everywhere on $\mathbb{R}^3$. This is important as counterexamples exist that do not satisfy this
requirement, similar to the one that we met in a previous context in Section 1.3.4.
These counterexamples will take on a life of their own in future courses where they 
provide the foundations to think about topology, both in mathematics and physics. 


3.2.4 The Laplacian 


The Laplacian is a second order differential operator defined by

$$\nabla^2 = \nabla \cdot \nabla$$

For example, in 3d the Laplacian takes the form

$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$$


This is a scalar differential operator meaning that, when acting on a scalar field $\phi$, it
gives back another scalar field $\nabla^2\phi$. Similarly, it acts component by component on a
vector field $\mathbf{F}$, giving back another vector field $\nabla^2\mathbf{F}$. If we use the vector triple product
formula, we find

$$\nabla \times (\nabla \times \mathbf{F}) = \nabla(\nabla \cdot \mathbf{F}) - \nabla^2\mathbf{F}$$

which we can rearrange to give an alternative expression for the Laplacian acting on
the components of a vector field

$$\nabla^2\mathbf{F} = \nabla(\nabla \cdot \mathbf{F}) - \nabla \times (\nabla \times \mathbf{F})$$

We'll devote Section 5 to solving various equations involving the Laplacian.
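The triple product step can be spelled out in index notation. The short derivation below assumes the standard contraction identity $\epsilon_{ijk}\epsilon_{klm} = \delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}$ and Cartesian components, so that partial derivatives commute and the basis vectors are constant.

```latex
\begin{align*}
\left[\nabla \times (\nabla \times \mathbf{F})\right]_i
  &= \epsilon_{ijk}\,\partial_j\left(\epsilon_{klm}\,\partial_l F_m\right) \\
  &= \left(\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl}\right)\partial_j\partial_l F_m \\
  &= \partial_i\left(\partial_j F_j\right) - \partial_j\partial_j F_i \\
  &= \left[\nabla(\nabla \cdot \mathbf{F}) - \nabla^2\mathbf{F}\right]_i
\end{align*}
```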


3.2.5 Some Vector Calculus Equations in Physics 


I mentioned in the introduction that all laws of physics are written in the language 
of vector calculus (or, in the case of general relativity, a version of vector calculus 
extended to curved spaces, known as differential geometry). Here, for example, are the 
four equations of electromagnetism, known collectively as the Maxwell equations 


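$$\begin{aligned}
\nabla \cdot \mathbf{E} &= \frac{\rho}{\epsilon_0} \quad ; \quad & \nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla \cdot \mathbf{B} &= 0 \quad ; \quad & \nabla \times \mathbf{B} &= \mu_0\left(\mathbf{J} + \epsilon_0\frac{\partial \mathbf{E}}{\partial t}\right)
\end{aligned} \tag{3.7}$$

Here $\mathbf{E}$ and $\mathbf{B}$ are the electric and magnetic fields, while $\rho(\mathbf{x})$ is a scalar field that
describes the distribution of electric charge in space and $\mathbf{J}(\mathbf{x})$ is a vector field that
describes the distribution of electric currents. The equations also include two constants
of nature, $\epsilon_0$ and $\mu_0$, which describe the strengths of the electric and magnetic forces
respectively.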


This simple set of equations describes everything we know about electricity,
magnetism and light. Extracting this information requires the tools that we will develop
in the rest of these lectures. Along the way, we will sometimes turn to the Maxwell
equations to illustrate new ideas.

You'll find the Laplacian sitting in many other equations of physics. For example,
the Schrödinger equation describing a quantum particle is written using the Laplacian.
A particularly important equation, that crops up in many places, is the heat equation,

$$\frac{\partial T}{\partial t} = D\nabla^2 T$$

This tells us, for example, how temperature $T(\mathbf{x}, t)$ evolves over time. Here $D$ is called
the diffusion constant. This same equation also governs the spread of many other
substances when there is some random element in the process, such as the constant
bombardment from other atoms. For example, the smell of that guy who didn't shower
before coming to lectures spreads through the room in a manner described by the heat
equation.
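To get a feel for what the heat equation does, here is a minimal one-dimensional sketch. The explicit finite-difference scheme, the periodic grid and all numerical values are assumptions made for illustration; the scheme is only stable when $D\,\Delta t/\Delta x^2 \le 1/2$.

```python
def heat_step(T, D, dx, dt):
    """One explicit step of dT/dt = D d^2T/dx^2 on a periodic grid."""
    n = len(T)
    r = D * dt / dx**2            # must satisfy r <= 1/2 for stability
    return [T[i] + r * (T[(i + 1) % n] - 2 * T[i] + T[(i - 1) % n])
            for i in range(n)]

def evolve(T, D, dx, dt, steps):
    """Apply heat_step repeatedly and return the final profile."""
    for _ in range(steps):
        T = heat_step(T, D, dx, dt)
    return T

n, dx, D, dt = 50, 1.0, 1.0, 0.25   # r = 0.25
T0 = [0.0] * n
T0[n // 2] = 100.0                  # a single hot spot
T1 = evolve(T0, D, dx, dt, 200)

print("peak temperature:", max(T0), "->", max(T1))
print("total heat:", sum(T0) * dx, "->", sum(T1) * dx)
```

The hot spot spreads out and its peak drops, while the total amount of heat on the grid stays fixed.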


3.3 Orthogonal Curvilinear Coordinates 


The definition of all our differential operators relied heavily on using Cartesian
coordinates. The purpose of this section is simply to ask what these objects look like
in different coordinate systems. As usual, the spherical polar and cylindrical polar
coordinates in $\mathbb{R}^3$ will be of particular interest to us.

In general, we can describe a point $\mathbf{x}$ in $\mathbb{R}^3$ using some coordinates $u, v, w$, so
$\mathbf{x} = \mathbf{x}(u,v,w)$. Changing any one of these coordinates, leaving the others fixed, results in a
change in $\mathbf{x}$. We have

$$d\mathbf{x} = \frac{\partial \mathbf{x}}{\partial u}\,du + \frac{\partial \mathbf{x}}{\partial v}\,dv + \frac{\partial \mathbf{x}}{\partial w}\,dw \tag{3.8}$$

Here $\partial\mathbf{x}/\partial u$ is the tangent vector to the lines defined by $v, w$ = constant, with similar
statements for the others. A given set of coordinates provides a good parameterisation
of some region provided that

$$\frac{\partial \mathbf{x}}{\partial u} \cdot \left( \frac{\partial \mathbf{x}}{\partial v} \times \frac{\partial \mathbf{x}}{\partial w} \right) \neq 0$$

The coordinates $(u, v, w)$ are said to be orthogonal curvilinear if the three tangent vectors
are mutually orthogonal. Here the slightly odd name "curvilinear" reflects the fact that
these tangent vectors are typically not constant, but instead depend on position. We'll
see examples shortly.


For orthogonal curvilinear coordinates, we can always define orthonormal tangent
vectors simply by normalising them. We write

$$\frac{\partial \mathbf{x}}{\partial u} = h_u\mathbf{e}_u \quad ; \quad \frac{\partial \mathbf{x}}{\partial v} = h_v\mathbf{e}_v \quad ; \quad \frac{\partial \mathbf{x}}{\partial w} = h_w\mathbf{e}_w$$

where we've introduced scale factors $h_u, h_v, h_w > 0$ and $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$ form a
right-handed orthonormal basis so that $\mathbf{e}_u \times \mathbf{e}_v = \mathbf{e}_w$. This can always be achieved simply by
ordering the coordinates appropriately. Our original equation (3.8) can now be written
as

$$d\mathbf{x} = h_u\mathbf{e}_u\,du + h_v\mathbf{e}_v\,dv + h_w\mathbf{e}_w\,dw \tag{3.9}$$

Squaring this, we have

$$d\mathbf{x}^2 = h_u^2\,du^2 + h_v^2\,dv^2 + h_w^2\,dw^2$$

from which it's clear that $h_u$, $h_v$ and $h_w$ are scale factors that tell us the change in
length as we change each of the coordinates.


Throughout this section, we'll illustrate everything with three coordinate systems.

Cartesian Coordinates

First, Cartesian coordinates are easy:

$$\mathbf{x} = (x,y,z) \implies h_x = h_y = h_z = 1 \quad\text{and}\quad \mathbf{e}_x = \hat{\mathbf{x}},\ \mathbf{e}_y = \hat{\mathbf{y}},\ \mathbf{e}_z = \hat{\mathbf{z}}$$

Cylindrical Polar Coordinates

Next, cylindrical polar coordinates are defined by (see also (2.7))

$$\mathbf{x} = (\rho\cos\phi, \rho\sin\phi, z)$$

with $\rho > 0$ and $\phi \in [0, 2\pi)$ and $z \in \mathbb{R}$. Inverting,

$$\rho = \sqrt{x^2 + y^2} \quad\text{and}\quad \tan\phi = \frac{y}{x}$$

It's straightforward to calculate

$$\begin{aligned}
\mathbf{e}_\rho &= \hat{\boldsymbol{\rho}} = (\cos\phi, \sin\phi, 0) \\
\mathbf{e}_\phi &= \hat{\boldsymbol{\phi}} = (-\sin\phi, \cos\phi, 0) \\
\mathbf{e}_z &= \hat{\mathbf{z}}
\end{aligned}$$

with

$$h_\rho = h_z = 1 \quad\text{and}\quad h_\phi = \rho$$

The three orthonormal vectors are shown on the left-hand side of Figure 12 in red.
Note, in particular, that the vectors depend on $\phi$ and rotate as you change the point
at which they're evaluated.

Figure 12. Cylindrical polar coordinates, on the left, and spherical polar coordinates, on
the right.


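Spherical Polar Coordinates

Spherical polar coordinates are defined by (see also (2.5))

$$\mathbf{x} = (r\sin\theta\cos\phi, r\sin\theta\sin\phi, r\cos\theta)$$

with $r > 0$ and $\theta \in [0, \pi]$ and $\phi \in [0, 2\pi)$. Inverting,

$$r = \sqrt{x^2 + y^2 + z^2} \quad , \quad \tan\theta = \frac{\sqrt{x^2 + y^2}}{z} \quad , \quad \tan\phi = \frac{y}{x}$$

Again, we can easily calculate the basis vectors

$$\begin{aligned}
\mathbf{e}_r &= \hat{\mathbf{r}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta) \\
\mathbf{e}_\theta &= \hat{\boldsymbol{\theta}} = (\cos\theta\cos\phi, \cos\theta\sin\phi, -\sin\theta) \\
\mathbf{e}_\phi &= \hat{\boldsymbol{\phi}} = (-\sin\phi, \cos\phi, 0)
\end{aligned}$$

These are shown in the right-hand side of Figure 12 in red. This time, the scaling
factors are

$$h_r = 1 \quad , \quad h_\theta = r \quad , \quad h_\phi = r\sin\theta$$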
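The scale factors are simply the lengths of the tangent vectors $\partial\mathbf{x}/\partial u$, so they can be estimated numerically from the coordinate map alone. The sketch below does this for spherical polars using central differences; the helper names and the sample point are hypothetical choices for illustration.

```python
import math

def spherical(r, theta, phi):
    """Map (r, theta, phi) to Cartesian (x, y, z)."""
    s = math.sin(theta)
    return (r * s * math.cos(phi), r * s * math.sin(phi), r * math.cos(theta))

def tangent(x_of, coords, k, eps=1e-6):
    """Central-difference estimate of the tangent vector dx/du_k."""
    up, dn = list(coords), list(coords)
    up[k] += eps
    dn[k] -= eps
    a, b = x_of(*up), x_of(*dn)
    return tuple((ai - bi) / (2 * eps) for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def scale_factors(x_of, coords):
    """Lengths of the three tangent vectors at the given point."""
    return tuple(math.sqrt(dot(t, t))
                 for t in (tangent(x_of, coords, k) for k in range(3)))

point = (2.0, 0.7, 1.9)                    # (r, theta, phi)
h_r, h_theta, h_phi = scale_factors(spherical, point)
tangents = [tangent(spherical, point, k) for k in range(3)]

print(h_r, h_theta, h_phi)                 # compare with 1, r, r sin(theta)
print(dot(tangents[0], tangents[1]), dot(tangents[0], tangents[2]),
      dot(tangents[1], tangents[2]))       # all close to zero: orthogonal
```

Swapping `spherical` for a cylindrical map would give the estimates of $h_\rho$, $h_\phi$ and $h_z$ in the same way.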


We'll now see how various vector operators appear when written in polar coordinates. 


3.3.1 Grad 


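The gradient operator is straightforward. If we shift the position from $\mathbf{x}$ to $\mathbf{x} + \delta\mathbf{x}$,
then a scalar field $f(\mathbf{x})$ changes by

$$df = \nabla f \cdot d\mathbf{x} \tag{3.10}$$

This definition can now be used in any coordinate system. In a general coordinate
system we have

$$df = \frac{\partial f}{\partial u}\,du + \frac{\partial f}{\partial v}\,dv + \frac{\partial f}{\partial w}\,dw = \nabla f \cdot \left(h_u\mathbf{e}_u\,du + h_v\mathbf{e}_v\,dv + h_w\mathbf{e}_w\,dw\right)$$

Using the orthonormality of the basis vectors, and comparing the terms on
the left and right, this then gives us the gradient operator

$$\nabla f = \frac{1}{h_u}\frac{\partial f}{\partial u}\,\mathbf{e}_u + \frac{1}{h_v}\frac{\partial f}{\partial v}\,\mathbf{e}_v + \frac{1}{h_w}\frac{\partial f}{\partial w}\,\mathbf{e}_w \tag{3.11}$$

In cylindrical polar coordinates, the gradient of a function $f(\rho, \phi, z)$ is

$$\nabla f = \frac{\partial f}{\partial \rho}\,\hat{\boldsymbol{\rho}} + \frac{1}{\rho}\frac{\partial f}{\partial \phi}\,\hat{\boldsymbol{\phi}} + \frac{\partial f}{\partial z}\,\hat{\mathbf{z}}$$

In spherical polar coordinates, the gradient of a function $f(r, \theta, \phi)$ is

$$\nabla f = \frac{\partial f}{\partial r}\,\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial \theta}\,\hat{\boldsymbol{\theta}} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi}\,\hat{\boldsymbol{\phi}}$$

Note, in particular, that when we differentiate with respect to an angle there is always
a compensating 1/length prefactor to make sure that the dimensions are right.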


3.3.2 Div and Curl 


To construct the div and curl in a general coordinate system, we first extract the vector
differential operator

$$\nabla = \mathbf{e}_u\frac{1}{h_u}\frac{\partial}{\partial u} + \mathbf{e}_v\frac{1}{h_v}\frac{\partial}{\partial v} + \mathbf{e}_w\frac{1}{h_w}\frac{\partial}{\partial w} \tag{3.12}$$

where, importantly, we've placed the vectors to the left of the differentials because, as
we've seen, the basis vectors now typically depend on the coordinates. If we act on a
function $f$ with this operator, we recover the gradient (3.11). But now we have this
abstract operator, we can also take it to act on a vector field $\mathbf{F}(u, v, w)$. We can expand
the vector field as

$$\mathbf{F}(u,v,w) = F_u\mathbf{e}_u + F_v\mathbf{e}_v + F_w\mathbf{e}_w$$


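Each of the components depends on the coordinates $u$, $v$ and $w$. But so too, in general,
do the basis vectors $\{\mathbf{e}_u, \mathbf{e}_v, \mathbf{e}_w\}$. This means that when the derivatives in the differential
operator (3.12) hit $\mathbf{F}$, they act on both the components and the basis vectors.

Given an explicit expression for the basis vectors, it's not hard to see what happens
when they are differentiated. For example, in cylindrical polar coordinates we find

$$\nabla \cdot \mathbf{F} = \frac{1}{\rho}\frac{\partial(\rho F_\rho)}{\partial \rho} + \frac{1}{\rho}\frac{\partial F_\phi}{\partial \phi} + \frac{\partial F_z}{\partial z}$$

and

$$\nabla \times \mathbf{F} = \left(\frac{1}{\rho}\frac{\partial F_z}{\partial \phi} - \frac{\partial F_\phi}{\partial z}\right)\hat{\boldsymbol{\rho}} + \left(\frac{\partial F_\rho}{\partial z} - \frac{\partial F_z}{\partial \rho}\right)\hat{\boldsymbol{\phi}} + \frac{1}{\rho}\left(\frac{\partial(\rho F_\phi)}{\partial \rho} - \frac{\partial F_\rho}{\partial \phi}\right)\hat{\mathbf{z}}$$

There is a question on Examples Sheet 2 that asks you to explicitly verify this. Meanwhile,
in spherical polar coordinates, we have

$$\nabla \cdot \mathbf{F} = \frac{1}{r^2}\frac{\partial(r^2 F_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial(\sin\theta\,F_\theta)}{\partial \theta} + \frac{1}{r\sin\theta}\frac{\partial F_\phi}{\partial \phi}$$

and

$$\nabla \times \mathbf{F} = \frac{1}{r\sin\theta}\left(\frac{\partial(\sin\theta\,F_\phi)}{\partial \theta} - \frac{\partial F_\theta}{\partial \phi}\right)\hat{\mathbf{r}} + \frac{1}{r}\left(\frac{1}{\sin\theta}\frac{\partial F_r}{\partial \phi} - \frac{\partial(r F_\phi)}{\partial r}\right)\hat{\boldsymbol{\theta}} + \frac{1}{r}\left(\frac{\partial(r F_\theta)}{\partial r} - \frac{\partial F_r}{\partial \theta}\right)\hat{\boldsymbol{\phi}}$$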

For completeness, we also give the general results 


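Claim: Given a vector field $\mathbf{F}(u, v, w)$ in a general orthogonal, curvilinear coordinate
system, the divergence is given by

$$\nabla \cdot \mathbf{F} = \frac{1}{h_u h_v h_w}\left(\frac{\partial}{\partial u}(h_v h_w F_u) + \frac{\partial}{\partial v}(h_u h_w F_v) + \frac{\partial}{\partial w}(h_u h_v F_w)\right) \tag{3.13}$$

and the curl is given by

$$\nabla \times \mathbf{F} = \frac{1}{h_u h_v h_w}\begin{vmatrix} h_u\mathbf{e}_u & h_v\mathbf{e}_v & h_w\mathbf{e}_w \\ \dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\ h_u F_u & h_v F_v & h_w F_w \end{vmatrix}$$

where the derivatives on the second line should now be thought of as acting on the
third line only, but not the first. This means that, in components, we have

$$\nabla \times \mathbf{F} = \frac{1}{h_v h_w}\left(\frac{\partial}{\partial v}(h_w F_w) - \frac{\partial}{\partial w}(h_v F_v)\right)\mathbf{e}_u + \text{two similar terms}$$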
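As a quick consistency check on the claim, we can feed the spherical polar scale factors $h_r = 1$, $h_\theta = r$, $h_\phi = r\sin\theta$ into (3.13), taking $(u,v,w) = (r,\theta,\phi)$. This is a worked substitution, not a proof.

```latex
\begin{align*}
\nabla \cdot \mathbf{F}
  &= \frac{1}{r^2\sin\theta}\left(
       \frac{\partial}{\partial r}\left(r^2\sin\theta\,F_r\right)
     + \frac{\partial}{\partial \theta}\left(r\sin\theta\,F_\theta\right)
     + \frac{\partial}{\partial \phi}\left(r\,F_\phi\right)\right) \\
  &= \frac{1}{r^2}\frac{\partial(r^2 F_r)}{\partial r}
     + \frac{1}{r\sin\theta}\frac{\partial(\sin\theta\,F_\theta)}{\partial \theta}
     + \frac{1}{r\sin\theta}\frac{\partial F_\phi}{\partial \phi}
\end{align*}
```

This reproduces the spherical polar divergence, since $\sin\theta$ passes through the $r$ derivative and $r$ passes through the angular derivatives.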


Proof: Not now. Later. It turns out to be a little easier when we have some integral
technology in hand. For this reason, we'll revisit this in Section 4.4.4.

3.3.3 The Laplacian

Finally, we have the Laplacian. From (3.11) and (3.13), this takes the general form


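$$\nabla^2 f = \frac{1}{h_u h_v h_w}\left[\frac{\partial}{\partial u}\left(\frac{h_v h_w}{h_u}\frac{\partial f}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{h_u h_w}{h_v}\frac{\partial f}{\partial v}\right) + \frac{\partial}{\partial w}\left(\frac{h_u h_v}{h_w}\frac{\partial f}{\partial w}\right)\right]$$

Obviously in Cartesian coordinates, the Laplacian is

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}$$

In cylindrical polar coordinates it takes the form

$$\nabla^2 f = \frac{1}{\rho}\frac{\partial}{\partial \rho}\left(\rho\frac{\partial f}{\partial \rho}\right) + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial \phi^2} + \frac{\partial^2 f}{\partial z^2} \tag{3.14}$$

and in spherical polar coordinates

$$\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial \theta}\left(\sin\theta\frac{\partial f}{\partial \theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 f}{\partial \phi^2} \tag{3.15}$$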
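The spherical polar expression can be tested against a function whose Cartesian Laplacian is known. The sketch below discretises each of the three terms of the spherical formula with finite differences. The test function $f = x^2 y + z^3$ (with $\nabla^2 f = 2y + 6z$), the sample point and the step size are assumptions made here for illustration.

```python
import math

def f_cartesian(x, y, z):
    return x * x * y + z ** 3              # Laplacian is 2y + 6z

def g(r, theta, phi):
    """The same function written in terms of spherical polar coordinates."""
    s = math.sin(theta)
    return f_cartesian(r * s * math.cos(phi), r * s * math.sin(phi),
                       r * math.cos(theta))

def laplacian_spherical(g, r, theta, phi, h=1e-4):
    """Finite-difference version of the spherical Laplacian, term by term."""
    g0 = g(r, theta, phi)
    radial = ((r + h / 2) ** 2 * (g(r + h, theta, phi) - g0)
              - (r - h / 2) ** 2 * (g0 - g(r - h, theta, phi))) / (h * h * r * r)
    polar = (math.sin(theta + h / 2) * (g(r, theta + h, phi) - g0)
             - math.sin(theta - h / 2) * (g0 - g(r, theta - h, phi))
             ) / (h * h * r * r * math.sin(theta))
    azimuthal = (g(r, theta, phi + h) - 2 * g0 + g(r, theta, phi - h)
                 ) / (h * h * (r * math.sin(theta)) ** 2)
    return radial + polar + azimuthal

r, theta, phi = 1.3, 0.8, 0.5
y = r * math.sin(theta) * math.sin(phi)
z = r * math.cos(theta)
exact = 2 * y + 6 * z
approx = laplacian_spherical(g, r, theta, phi)
print(exact, approx)
```

The two numbers should agree to several decimal places; the residual difference is discretisation and rounding error.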


The most canonical of canonical physics textbooks is J.D. Jackson's "Classical
Electrodynamics". I don't know of any theoretical physicist who doesn't have a copy on
their shelf. It's an impressive book but I'm pretty sure that, for many, the main selling
point is that it has these expressions for div, grad and curl in cylindrical and polar
coordinates printed on the inside cover. You can also find these results collated on the
last pages of these lecture notes. We'll return to the Laplacian in different coordinate
systems in Section 5.2 where we'll explore the solutions to equations like $\nabla^2 f = 0$.

4 The Integral Theorems

The fundamental theorem of calculus states that integration is the inverse of
differentiation, in the sense that

$$\int_a^b \frac{df}{dx}\,dx = f(b) - f(a)$$

In this section, we describe a number of generalisations of this result to higher dimensional
integrals. Along the way, we will also gain some intuition for the meaning of the
various vector derivative operators.


4.1 The Divergence Theorem 


The divergence theorem, also known as Gauss' theorem, states that, for a smooth vector
field $\mathbf{F}(\mathbf{x})$ over $\mathbb{R}^3$,

$$\int_V \nabla \cdot \mathbf{F}\,dV = \int_S \mathbf{F} \cdot d\mathbf{S} \tag{4.1}$$

where $V$ is a bounded region whose boundary $\partial V = S$ is a piecewise smooth closed
surface. The integral on the right-hand side is taken with the normal $\mathbf{n}$ pointing
outward.


The Meaning of the Divergence 


We'll prove the divergence theorem shortly. But first, let's make good on our promise
to build some intuition for the divergence. To this end, integrate $\nabla \cdot \mathbf{F}$ over some region
of volume $V$ centred at the point $\mathbf{x}$. If the region is small enough, then $\nabla \cdot \mathbf{F}$ will be
roughly constant, and so

$$\int_V \nabla \cdot \mathbf{F}\,dV \approx V\,\nabla \cdot \mathbf{F}(\mathbf{x})$$

and this becomes exact as the region shrinks to zero size. The divergence theorem then
provides a coordinate independent definition of the divergence

$$\nabla \cdot \mathbf{F} = \lim_{V \to 0}\frac{1}{V}\int_S \mathbf{F} \cdot d\mathbf{S} \tag{4.2}$$
This is the result that we advertised in Section 3: the right way to think about the
divergence of a vector field is as the net flow into, or out of, a region. If $\nabla \cdot \mathbf{F} > 0$ at
some point $\mathbf{x}$, then there is a net flow out of that point; if $\nabla \cdot \mathbf{F} < 0$ at some point $\mathbf{x}$
then there is a net flow inwards.

We can illustrate this by looking at a couple of the Maxwell equations (3.7). The
magnetic field $\mathbf{B}$ is solenoidal, obeying

$$\nabla \cdot \mathbf{B} = 0$$

This means that the magnetic vector field can't pile up anywhere: at any given point
in space, there is as much magnetic field coming in as there is going out. This leads
us to draw the magnetic field as continuous, never ending streamlines. For example,
the magnetic field lines for a solenoid, a long coil of wire carrying a current, are shown
in the figure (taken from the website hyperphysics).

Meanwhile, the electric field $\mathbf{E}$ obeys

$$\nabla \cdot \mathbf{E} = \frac{\rho}{\epsilon_0}$$

where $\rho(\mathbf{x})$ is the electric charge density. In any region of space where there's no
electric charge, so $\rho(\mathbf{x}) = 0$, the electric field lines act just like the magnetic field and
can't pile up anywhere. However, the presence of electric charge changes this, and causes
the field lines to pile up or disappear. In other words, the electric charge acts as a source or
a sink for electric field lines. The electric field lines arising from two pointlike, positive
charges which act as sources, are shown in the figure.
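The idea that the divergence is the net outward flow per unit volume can be tried out numerically: compute the flux through a small cube, divide by its volume, and compare with the divergence at the centre. The field $\mathbf{F} = (\sin x, xy, z^2)$, the midpoint quadrature and the cube sizes below are illustrative assumptions.

```python
import math

def F(x, y, z):
    return (math.sin(x), x * y, z * z)

def div_F(x, y, z):
    return math.cos(x) + x + 2 * z          # divergence computed by hand

def flux_per_volume(F, centre, a, n=8):
    """Outward flux of F through a cube of side a, divided by a^3."""
    total = 0.0
    dA = (a / n) ** 2
    offsets = [(-0.5 + (i + 0.5) / n) * a for i in range(n)]
    for axis in range(3):
        for sign in (+1, -1):               # the two faces normal to this axis
            for s in offsets:
                for t in offsets:
                    p = list(centre)
                    p[axis] += sign * a / 2
                    p[(axis + 1) % 3] += s
                    p[(axis + 2) % 3] += t
                    total += sign * F(*p)[axis] * dA
    return total / a ** 3

centre = (0.4, -0.3, 0.9)
exact = div_F(*centre)
err_big = abs(flux_per_volume(F, centre, 0.1) - exact)
err_small = abs(flux_per_volume(F, centre, 0.01) - exact)
print(exact, err_big, err_small)
```

Shrinking the cube should shrink the mismatch, in line with the limit in the coordinate independent definition of the divergence.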


Example 


Before proving the theorem, we first give an example. Take the volume $V$ to be the
solid hemispherical ball, defined as $x^2 + y^2 + z^2 \le R^2$ and $z \ge 0$. The boundary of $V$
then has two pieces

$$\partial V = S_1 + S_2$$

where $S_1$ is the hemisphere and $S_2$ the disc in the $z = 0$ plane. We'll integrate the
vector field

$$\mathbf{F} = (0, 0, z + R)$$

The $+R$ doesn't contribute in the volume integral since we have $\nabla \cdot \mathbf{F} = 1$. Then

$$\int_V \nabla \cdot \mathbf{F}\,dV = \int_V dV = \frac{2}{3}\pi R^3 \tag{4.3}$$

which is the volume of the hemispherical ball. For the surface integral, we work with
$S_1$ and $S_2$ separately. On the hemisphere $S_1$, the unit normal vector is $\mathbf{n} = \frac{1}{R}(x, y, z)$
and so

$$\mathbf{F} \cdot \mathbf{n} = \frac{z(z + R)}{R} = R\cos\theta(\cos\theta + 1)$$

where we've used polar coordinates $z = R\cos\theta$. The integral is then

$$\begin{aligned}
\int_{S_1} \mathbf{F} \cdot d\mathbf{S} &= \int_0^{2\pi} d\phi \int_0^{\pi/2} d\theta\,(R^2\sin\theta)\,R\cos\theta(\cos\theta + 1) \\
&= 2\pi R^3\left[-\frac{1}{3}\cos^3\theta - \frac{1}{2}\cos^2\theta\right]_0^{\pi/2} = \frac{5\pi R^3}{3}
\end{aligned}$$

where the $R^2\sin\theta$ factor in the first line is the Jacobian that we previously saw in (2.9).
Meanwhile, for the integral over the disc $S_2$, we have the normal vector $\mathbf{n} = (0, 0, -1)$,
and so (remembering that the disc sits at $z = 0$),

$$\mathbf{F} \cdot \mathbf{n} = -R \implies \int_{S_2} \mathbf{F} \cdot d\mathbf{S} = (-R) \times \pi R^2$$

with $\pi R^2$ the area of the disc. Adding these together, we have

$$\int_{S_1 + S_2} \mathbf{F} \cdot d\mathbf{S} = \frac{2}{3}\pi R^3$$

which reproduces the volume integral as promised.
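The hemisphere example is easy to reproduce numerically. The sketch below evaluates the flux through the curved surface with a midpoint rule in $\theta$ and adds the flux through the disc. The radius value and the number of quadrature points are arbitrary choices for illustration.

```python
import math

def hemisphere_flux(R, n=2000):
    """Midpoint-rule flux of F = (0, 0, z + R) through the curved surface."""
    dtheta = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        z = R * math.cos(theta)
        F_dot_n = (z + R) * math.cos(theta)   # n = (x, y, z)/R, so F.n = (z + R) z / R
        total += F_dot_n * R ** 2 * math.sin(theta) * dtheta
    return 2 * math.pi * total                # integrand does not depend on phi

def disc_flux(R):
    """Flux through the disc at z = 0, where n = (0, 0, -1) and F.n = -R."""
    return -R * math.pi * R ** 2

R = 1.5
surface_integral = hemisphere_flux(R) + disc_flux(R)
volume_integral = 2 * math.pi * R ** 3 / 3    # div F = 1, so this is the volume
print(surface_integral, volume_integral)
```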


4.1.1 A Proof of the Divergence Theorem 


We start by giving an informal sketch of the basic idea underlying the divergence 
theorem. We’ll then proceed with a more rigorous proof. 


To get some intuition for the divergence theorem, take the volume $V$ and divide it
up into a bunch of small cubes. A given cube $V_\mathbf{x}$ has one corner of the cube sitting
at $\mathbf{x} = (x, y, z)$ and sides of lengths $\delta x$, $\delta y$ and $\delta z$. For a small enough cube, we can
think of $\mathbf{F}$ as being approximately constant on any given side. The flux of $\mathbf{F}$ through
the two sides that lie in the $(y, z)$ plane is given by

$$\left[F_x(x + \delta x, y, z) - F_x(x, y, z)\right]\delta y\,\delta z \approx \frac{\partial F_x}{\partial x}\,\delta x\,\delta y\,\delta z$$

where the minus sign comes because the flux is calculated using the outward pointing
normal and the right-hand side comes from Taylor expanding $F_x(x + \delta x, y, z)$. We get
similar expressions for the integrals over the sides that lie in the $(x, y)$ plane and in
the $(x, z)$ plane. Now sum over all such small boxes in $V$. The contributions on the
left-hand side from any interior walls will cancel out, leaving us just with a surface
integral over $S = \partial V$, while all the contributions add on the right-hand side. We're left
with

$$\int_S \mathbf{F} \cdot d\mathbf{S} = \int_V \nabla \cdot \mathbf{F}\,dV$$

This is the claimed result.

The derivation above is simple and intuitive, but it might leave you a little nervous.
The essence of the divergence theorem is to relate a bulk integral to a boundary integral.
But it's not obvious that the boundary can be well approximated by stacking cubes
together. To give an analogy, if you try to approximate a 45° line by a series of
horizontal and vertical lines, as shown on the right, then the total length of the steps
is always going to be $\sqrt{2}$ larger than the length of the 45° line, no matter how
fine you make them. You might worry that these kinds of issues afflict the proof above.
For that reason, we now give a more careful derivation of the divergence theorem.
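The staircase worry is easy to see in numbers. In the sketch below, which takes the 45° line from $(0,0)$ to $(1,1)$ as an assumed concrete case, the staircase has total length 2 however many steps are used, a factor $\sqrt{2}$ longer than the diagonal.

```python
import math

def staircase_length(n):
    """Total length of n horizontal and n vertical steps from (0,0) to (1,1)."""
    step = 1.0 / n
    return n * (step + step)

diagonal = math.sqrt(2)                       # length of the 45 degree line itself
for n in (1, 10, 1000):
    print(n, staircase_length(n), staircase_length(n) / diagonal)
```

The ratio never approaches 1, even though the staircase converges to the line pointwise.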


Before we proceed, first note that, suitably interpreted, the divergence theorem holds
in arbitrary dimension $\mathbb{R}^n$, where a "surface" now means a codimension one subspace.
In particular, the divergence theorem holds in $\mathbb{R}^2$, where a surface is a curve. This
result, which is interesting in its own right, will serve as a warm-up exercise to proving
the general divergence theorem.

The 2d Divergence Theorem: Let $\mathbf{F}$ be a vector field in $\mathbb{R}^2$. Then

$$\int_D \nabla \cdot \mathbf{F}\,dA = \int_C \mathbf{F} \cdot \mathbf{n}\,ds \tag{4.4}$$

where $D$ is a region in $\mathbb{R}^2$, bounded by the closed curve $C$ and $\mathbf{n}$ is the outward normal
to $C$.


Proof of the 2d Divergence Theorem: For simplicity, we'll assume that
$\mathbf{F} = F(x, y)\,\hat{\mathbf{y}}$. The proof that we're about to give also works if $\mathbf{F}$ points solely in the $x$
direction, but a general $\mathbf{F}$ is just a linear sum of the two.

We then have

$$\int_D \nabla \cdot \mathbf{F}\,dA = \int_X dx \int_{y_-(x)}^{y_+(x)} dy\,\frac{\partial F}{\partial y}$$

where, as the notation shows, we've chosen to do the area integral by first integrating
over $y$, and then over $x$. We'll assume, for now, that the region $D$ is convex, as shown
in the figure, so that each $\int dy$ is over just a single interval with limits $y_\pm(x)$.
These limits trace out an upper curve $C_+$, shown in red in the figure, and a lower curve
$C_-$ shown in blue. We then have

$$\int_D \nabla \cdot \mathbf{F}\,dA = \int_X dx\,\Big(F(x, y_+(x)) - F(x, y_-(x))\Big)$$

We've succeeded in converting the area integral into an ordinary integral, but it's not
quite of the line integral form that we need. The next part of the proof is to massage the
integral over $\int dx$ into a line integral over $\int ds$. This is easily achieved if we look at the
zoomed-in figure to the right. Along the upper curve $C_+$, the normal $\mathbf{n}$ points
upwards and makes an angle $\theta$ with the vertical, where $\cos\theta = \hat{\mathbf{y}} \cdot \mathbf{n}$. Moving a small
distance $\delta s$ along the curve is equivalent to moving

$$\delta x = \cos\theta\,\delta s = \hat{\mathbf{y}} \cdot \mathbf{n}\,\delta s \quad \text{along } C_+$$

Along the lower curve, $C_-$, the normal $\mathbf{n}$ points downwards and so $\hat{\mathbf{y}} \cdot \mathbf{n}$ is negative.
We then have

$$\delta x = -\hat{\mathbf{y}} \cdot \mathbf{n}\,\delta s \quad \text{along } C_-$$

The upshot is that we can write the area integral as

$$\begin{aligned}
\int_D \nabla \cdot \mathbf{F}\,dA &= \int_{C_+} \mathbf{n} \cdot \mathbf{F}(x, y_+(x))\,ds + \int_{C_-} \mathbf{n} \cdot \mathbf{F}(x, y_-(x))\,ds \\
&= \int_{C_+} \mathbf{F} \cdot \mathbf{n}\,ds + \int_{C_-} \mathbf{F} \cdot \mathbf{n}\,ds \\
&= \int_C \mathbf{F} \cdot \mathbf{n}\,ds
\end{aligned}$$

with $C = C_+ + C_- = \partial D$ the boundary of the region.

Figure 13. Performing the $\int dz$ integral for the proof of the 3d divergence theorem.

We're left with one small loophole to close: if the region $D$ is not convex, then the
range of the integral $\int dy$ may be over two or more disconnected intervals, as shown
in the figure. In this case, the boundary curve decomposes into more pieces, but the basic
strategy still holds.


Proof of the 3d Divergence Theorem 


The proof of the 3d (or, indeed, higher dimensional) divergence theorem follows using
the same strategy. If we focus on $\mathbf{F} = F(x, y, z)\,\hat{\mathbf{z}}$ we have

$$\begin{aligned}
\int_V \nabla \cdot \mathbf{F}\,dV &= \int_D dA \int_{z_-(x,y)}^{z_+(x,y)} dz\,\frac{\partial F}{\partial z} \\
&= \int_D dA\,\Big(F(x, y, z_+(x,y)) - F(x, y, z_-(x,y))\Big)
\end{aligned}$$

where the limits of the integral $z_\pm(x, y)$ are the upper and lower surfaces of the volume
$V$. The area integral over $D$ is an integral in the $(x, y)$ plane, while to prove Gauss'
theorem we need to convert this into a surface integral over $S = \partial V$. This step of the
argument is the same as before: at any given point, the difference between $dA = dx\,dy$
and $dS$ is the factor $\cos\theta = \mathbf{n} \cdot \hat{\mathbf{z}}$ (up to a sign). This then gives the promised result
(4.1).


The Divergence Theorem for Scalar Fields 


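There is a straightforward extension of the divergence theorem for scalar fields $\phi$:

Claim: For $S = \partial V$, we have

$$\int_V \nabla\phi\,dV = \int_S \phi\,d\mathbf{S}$$

Proof: Consider the divergence theorem (4.1) with $\mathbf{F} = \phi\mathbf{a}$ where $\mathbf{a}$ is a constant
vector. We have

$$\int_V \nabla \cdot (\phi\mathbf{a})\,dV = \int_S (\phi\mathbf{a}) \cdot d\mathbf{S} \implies \mathbf{a} \cdot \left(\int_V \nabla\phi\,dV - \int_S \phi\,d\mathbf{S}\right) = 0$$

This is true for any constant vector $\mathbf{a}$, and so the expression in the brackets must itself
vanish.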


4.1.2 Carl Friedrich Gauss (1777-1855) 


Gauss is regarded by many as the greatest mathematician of all time. He made seminal 
contributions to number theory, algebra, geometry, and physics. 


Gauss was born to working class parents in what is now Lower Saxony, Germany. In
1795 he went to study at the university of Göttingen and remained there for the next
60 years.

There are remarkably few stories about Gauss that do not, at the end of the day,
boil down to the observation that he was just really good at maths. There is even a
website that has collected well over 100 retellings of how Gauss performed the sum
$\sum_{n=1}^{100} n$ when still a foetus. (You can find an interesting dissection of this story here.)


4.2 An Application: Conservation Laws 


Of the many important applications of the divergence theorem, one stands out. In 
many situations, we have the concept of a conservation law: some quantity that doesn't 
change over time. There are conservation laws in fundamental physics, including energy,
momentum, angular momentum and electric charge and several more that emerge when
we look to more sophisticated theories. There are also approximate conservation laws
at play when we model more complicated systems. For example, if you're interested in 
how the population distribution of some species evolves over time then it might well 
serve you to ignore birth rates and traffic accidents and consider the total number of 
animals to be fixed. 




In all these cases, the quantity is conserved. But we can say something stronger 
than that: it is conserved locally. For example, an electric charge sitting in the palm 
of your hand can't disappear and turn up on Jupiter. That would satisfy a "global" 
conservation of charge, but that's not the way the universe works. If the electric charge 
disappears from your hand, then most likely it has fallen off and is now sitting on the 
floor. Or, said more precisely, it must have moved to a nearby region of space. 


The divergence theorem provides the technology to describe local conservation laws
of this type. First, we introduce the density $\rho(\mathbf{x}, t)$ of the conserved object. For
the purposes of this discussion, we will take this to be the density of electric charge,
although it could equally well be the density of any of the other conserved quantities
described above. The total electric charge in some region $V$ is then given by the integral

$$Q = \int_V \rho\,dV$$

The conservation of charge is captured by the following statement: there exists a vector
field $\mathbf{J}(\mathbf{x}, t)$ such that

$$\frac{\partial \rho}{\partial t} + \nabla \cdot \mathbf{J} = 0$$

This is known as the continuity equation and $\mathbf{J}$ is called the current density.

The continuity equation doesn't tell us that the density $\rho$ can't change in time; that
would be overly prohibitive. But it does tell us that $\rho$ must change only in a certain
way. This ensures that the change in the charge $Q$ in a fixed region $V$ is given by

$$\frac{dQ}{dt} = \int_V \frac{\partial \rho}{\partial t}\,dV = -\int_V \nabla \cdot \mathbf{J}\,dV = -\int_S \mathbf{J} \cdot d\mathbf{S}$$

where the second equality follows from the continuity equation and the third from the
divergence theorem at some fixed time $t$. We learn that the charge inside a region can
only change if there is a current flowing through the surface of that region. This is how
the conservation of charge is enforced locally.
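A discrete version of this statement can be checked directly. The sketch below is a one-dimensional toy model, assumed here purely for illustration: charge sits in cells on a periodic grid, moves to the right with constant velocity $u$, and the current is evaluated at cell interfaces with a simple upwind rule. The change in charge inside a block of cells then equals the net current through the two ends of the block.

```python
def fluxes(rho, u):
    """Upwind current J = rho * u at the left interface of each cell (u > 0)."""
    n = len(rho)
    return [u * rho[(i - 1) % n] for i in range(n)]

def step(rho, u, dx, dt):
    """One time step of the discrete continuity equation; also returns J."""
    n = len(rho)
    J = fluxes(rho, u)
    new = [rho[i] - dt / dx * (J[(i + 1) % n] - J[i]) for i in range(n)]
    return new, J

def charge(rho, dx, a, b):
    """Total charge in cells a, ..., b-1."""
    return sum(rho[a:b]) * dx

n, dx, dt, u = 40, 0.5, 0.1, 1.0
rho = [1.0 if 5 <= i < 10 else 0.0 for i in range(n)]
a, b = 8, 20                                  # the region V is cells 8..19

Q_before = charge(rho, dx, a, b)
rho_new, J = step(rho, u, dx, dt)
Q_after = charge(rho_new, dx, a, b)
net_outflow = (J[b] - J[a]) * dt              # leaving at b minus entering at a

print(Q_after - Q_before, -net_outflow)
```

The interior interface currents cancel in pairs when summed over the region, which is the discrete counterpart of the divergence theorem step in the derivation.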


The intuition behind this idea is straightforward. If you want to keep tabs on the
number of people in a nightclub, you don't continuously count them. Instead you
measure the number of people entering and leaving through the door.

If the current is known to vanish outside some region, so $\mathbf{J}(\mathbf{x}) = 0$ for $|\mathbf{x}| > R$,
then the total charge contained inside that region must be unchanging. Often, in such
situations, we ask only that $\mathbf{J}(\mathbf{x}, t) \to 0$ suitably quickly as $|\mathbf{x}| \to \infty$, in which case the
total charge is unchanging

$$Q_{\text{total}} = \int_{\mathbb{R}^3} \rho\,dV \quad\text{and}\quad \frac{dQ_{\text{total}}}{dt} = 0$$

In later courses, we'll see many examples of the continuity equation. The example of
electric charge discussed above will be covered in the lectures on Electromagnetism,
where the flux of $\mathbf{J}$ through a surface $S$ is

$$I = \int_S \mathbf{J} \cdot d\mathbf{S}$$

and is what we usually call the electric current.

We will also see the same equation in the lectures on Quantum Mechanics where $\rho(\mathbf{x})$
has the interpretation of the probability density for a particle to be at some point $\mathbf{x}$
and $Q = \int_V \rho\,dV$ is the probability that the particle sits in some region $V$. Obviously,
in this example we must have $Q_{\text{total}} = 1$ which is the statement that the particle definitely
sits somewhere.


Finally, the continuity equation also plays an important role in Fluid Mechanics
where the mass of the fluid is conserved. In that case, $\rho(\mathbf{x}, t)$ is the density of the fluid
and the current is $\mathbf{J} = \rho\mathbf{u}$ where $\mathbf{u}(\mathbf{x}, t)$ is the velocity field. The continuity equation
then reads

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\mathbf{u}) = 0$$

In this case the flux is the mass of fluid that passes through a surface $S$ per unit time.

In many circumstances, liquids can be modelled as incompressible, meaning that
$\rho(\mathbf{x}, t)$ is a constant in both space and time. In these circumstances, we have
$\partial\rho/\partial t = 0$ and $\nabla\rho = 0$, and the continuity equation tells us that the velocity field is
necessarily solenoidal:

$$\nabla \cdot \mathbf{u} = 0 \tag{4.5}$$

This makes sense: for a solenoidal vector field, the flow into any region must be
accompanied by an equal outgoing flow, telling us that the fluid can't pile up anywhere, as
expected for an incompressible fluid. The statement that fluids are incompressible is a
fairly good approximation until we come to think about sound, which arises because of
changes in the density which propagate as waves.

4.2.1 Conservation and Diffusion


There is a close connection between conserved quantities and the idea of diffusion. We'll
illustrate this with the idea of energy conservation. The story takes a slightly different
form depending on the context, but here we'll think of the energy contained in a hot
gas. First, since energy is conserved there is necessarily a corresponding continuity
equation

$$\frac{\partial \mathcal{E}}{\partial t} + \nabla \cdot \mathbf{J} = 0 \tag{4.6}$$

where $\mathcal{E}(\mathbf{x}, t)$ is the energy density of the gas, and $\mathbf{J}$ is the heat current which tells us
how energy is transported from one region of space to another.

At this point we need to invoke a couple of physical principles. First, the energy
density in a gas is proportional to the temperature of the gas,

$$\mathcal{E}(\mathbf{x}, t) = c_V T(\mathbf{x}, t) \tag{4.7}$$

where $c_V$ is the specific heat capacity. Next comes a key step: in hot systems, where
everything is jiggling around randomly, the heat flow is due to temperature differences
between different parts of the system. The relation between the two is captured by the
equation

$$\mathbf{J} = -\kappa\nabla T \tag{4.8}$$

where $\kappa$ is called the thermal conductivity and the minus sign ensures that heat flows
from hot to cold. This relation is known as Fick's law. Neither (4.7) nor (4.8) is a
fundamental equation of physics and both can be derived from first principles by
thinking about the motion of the underlying atoms. (This will be described in the
lectures on Statistical Physics and, for Fick's law, the lectures on Kinetic Theory.)

Combining the continuity equation (4.6) with the definition of temperature (4.7) and
Fick's law (4.8), we have $c_V\,\partial T/\partial t = \nabla \cdot (\kappa\nabla T)$ and so, for constant $\kappa$, we find the
heat equation

$$\frac{\partial T}{\partial t} = D\nabla^2 T$$

where the diffusion constant is given by $D = \kappa/c_V$. This tells us how the temperature
of a system evolves. As we mentioned previously, the same heat equation describes the
diffusive motion of any conserved quantity.

4.2.2 Another Application: Predator-Prey Systems


We'll see more applications of the divergence theorem in Section 5, mainly in the
context of the gravitational and electrostatic forces. However, the uses of the theorem are
many and varied and stretch far beyond applications to the laws of physics. Here we
give an example in the world of ecology which is modelled mathematically by
differential equations. As we'll see, the use of $\nabla$ here is somewhat novel because we're not
differentiating with respect to space but with respect to some more abstract variables.

First some background. Predator-prey systems describe the interaction between two
species. We will take our predators to be wolves. (Because they're cool.) We will denote
the population of wolves at a given time $t$ as $w(t)$. The wolves prey upon something
cute and furry. We will denote the population of this cute, furry thing as $c(t)$.

We want to write down a system of differential equations to describe the interaction
between wolves and cute furry things. The simplest equations were first written down
by Lotka and Volterra and (after some rescaling) take the form

$$\begin{aligned}
\frac{dw}{dt} &= w(-\alpha + c) \\
\frac{dc}{dt} &= c(\beta - w)
\end{aligned}$$

where $\alpha, \beta > 0$ are some constants. There is a clear meaning to the different terms in
these equations. Without food, the wolves die out. That is what the $-\alpha w$ term in
the first equation is telling us which, if $c = 0$, will cause the wolf population to decay
exponentially quickly. In contrast, without wolves the cute furry things eat grass and
prosper. That's what the $+\beta c$ term in the second equation is telling us which, if $w = 0$,
ensures that the population of cute furry things grows exponentially. The second term
in each equation, $\pm wc$, tells us what happens when the wolves and cute furry things
meet. The $\pm$ sign means that it's good news for one, less good for the other.


The Lotka-Volterra equations are straightforward to solve. There is a fixed point at
$c = \alpha$ and $w = \beta$ at which the two populations are in equilibrium. Away from this, we
find periodic orbits as the two populations wax and wane. To see this, we think of
$w = w(c)$ and write the pair of equations as

$$\frac{dw}{dc} = \frac{w(-\alpha + c)}{c(\beta - w)}$$

This equation is separable and we have

$$\int \frac{\beta - w}{w}\,dw = \int \frac{-\alpha + c}{c}\,dc \implies \beta\log w - w + \alpha\log c - c = \text{constant}$$

These orbits are plotted in the $(c, w)$ plane, also known as the phase plane, for different
constants in the figure.
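The conserved combination $\beta\log w - w + \alpha\log c - c$ can be watched along a numerically integrated orbit. The sketch below uses a standard fourth-order Runge-Kutta step; the parameter values, initial populations and step size are assumptions chosen for illustration.

```python
import math

ALPHA, BETA = 1.0, 2.0                       # illustrative parameter values

def rhs(w, c):
    """Right-hand side of the Lotka-Volterra equations (dw/dt, dc/dt)."""
    return (w * (-ALPHA + c), c * (BETA - w))

def rk4_step(w, c, dt):
    """One classical Runge-Kutta step."""
    k1 = rhs(w, c)
    k2 = rhs(w + 0.5 * dt * k1[0], c + 0.5 * dt * k1[1])
    k3 = rhs(w + 0.5 * dt * k2[0], c + 0.5 * dt * k2[1])
    k4 = rhs(w + dt * k3[0], c + dt * k3[1])
    return (w + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            c + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def invariant(w, c):
    return BETA * math.log(w) - w + ALPHA * math.log(c) - c

w, c = 1.0, 0.5
H0 = invariant(w, c)
drift = 0.0
for _ in range(20000):                       # integrate up to t = 20
    w, c = rk4_step(w, c, 0.001)
    drift = max(drift, abs(invariant(w, c) - H0))

print("largest change in the conserved quantity:", drift)
```

The drift stays at the level of integration error, consistent with each orbit lying on a single level curve in the phase plane.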


So much for the Lotka-Volterra equations. Let's now look at something more
complicated. Suppose that there is some intra-species competition: a little wolfy bickering
that sometimes gets out of hand, and some cute, furry in-fighting. We can model this
by adding extra terms to the original equations:

$$\begin{aligned}
\frac{dw}{dt} &= w(-\alpha + c - \mu w) \\
\frac{dc}{dt} &= c(\beta - w - \nu c)
\end{aligned} \tag{4.9}$$

where the two new constants are also positive, $\mu, \nu > 0$. Both new terms come with
minus signs, which is appropriate because fighting is bad.

What do we do now? There is still a fixed point, now given by $(1 + \mu\nu)w = \beta - \nu\alpha$
and $(1 + \mu\nu)c = \alpha + \mu\beta$. But what happens away from this fixed point? Do the periodic
orbits that we saw earlier persist? Or does something different happen?

Sadly, we can't just solve the differential equation like we did before because it's no
longer separable. Instead, we're going to need a more creative method to understand
what's going on. This is where the divergence theorem comes in. We will use it to
show that, provided $\mu \neq 0$ or $\nu \neq 0$, the periodic orbits of the Lotka-Volterra equation
no longer exist.


We first change notation a little. We write the pair of predator-prey equations (4.9) 
in vector form 


da w w(—a + c — uw) 
— =p with a= and p= 
dt c c(B — w — vc) 
Any solution to these equations traces out a path a(t) in the animal phase plane. The 


re-writing above makes it clear that p is the tangent to this path. The question that 
we wish to answer is: does this path close? In other words, is there a periodic orbit? 


cS = 


It turns out that there are no periodic orbits. To show w 
this, we will suppose that periodic orbits exist and then 
argue by contradiction. The normal n to the path a(t) 
obeys n- p = 0, as shown in the figure. This means that if 
we integrate any function b(w, c) around the periodic orbit 


we have 


fep nac o 


By the 2d divergence theorem, this in turn means that the following integral over the 
area enclosed by the periodic orbit must also vanish: 


n : [b(w, c)p] dA = 0 


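where, in this context, the gradient operator is $\nabla = (\partial/\partial w, \partial/\partial c)$. At this juncture,
the trick is to find a cunning choice of function $b(w,c)$. The one that works for us is
$b = 1/wc$. This is because we have

$$\nabla\cdot\left(\frac{1}{wc}\,\mathbf{p}\right) = -\frac{\mu}{c} - \frac{\nu}{w}$$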

Both of these terms are strictly negative. (For this it is important to remember that
populations $w$ and $c$ are strictly positive!) But if $\nabla\cdot(\mathbf{p}/wc)$ is always negative then
there's no way to integrate it over a region and get zero. Something has gone wrong.
And what's gone wrong was our original assumption of closed orbits. We learn that the
nice periodic solutions of the Lotka-Volterra equations are spoiled by any intra-species
competition. We're left just with the fixed point which is now stable. All of which is
telling us that a little in-fighting may not be so bad after all. It keeps things stable.
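
The claim that orbits spiral into the fixed point can be checked numerically. What follows is only a sketch: the parameter values $\alpha = \beta = 1$, $\mu = \nu = 0.1$, the starting populations and the choice of a fourth-order Runge-Kutta integrator are all illustrative assumptions, not something fixed by the argument above.

```python
# Sketch: integrate the predator-prey system (4.9) and watch it settle.
# Parameter values, initial data and the RK4 integrator are assumptions.


def velocity(w, c, alpha, beta, mu, nu):
    """The tangent vector p = (dw/dt, dc/dt) from equation (4.9)."""
    return (w * (-alpha + c - mu * w), c * (beta - w - nu * c))


def rk4_step(w, c, dt, params):
    """Advance (w, c) by one classical fourth-order Runge-Kutta step."""
    k1 = velocity(w, c, *params)
    k2 = velocity(w + 0.5 * dt * k1[0], c + 0.5 * dt * k1[1], *params)
    k3 = velocity(w + 0.5 * dt * k2[0], c + 0.5 * dt * k2[1], *params)
    k4 = velocity(w + dt * k3[0], c + dt * k3[1], *params)
    w_new = w + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    c_new = c + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return w_new, c_new


def fixed_point(alpha, beta, mu, nu):
    """Solve (1 + mu nu) w = beta - nu alpha and (1 + mu nu) c = alpha + mu beta."""
    denom = 1 + mu * nu
    return ((beta - nu * alpha) / denom, (alpha + mu * beta) / denom)


def evolve(w, c, t_final, dt, params):
    """Integrate from (w, c) up to time t_final with fixed step dt."""
    for _ in range(int(round(t_final / dt))):
        w, c = rk4_step(w, c, dt, params)
    return w, c


def dulac_divergence(w, c, mu, nu):
    """The quantity div(p / wc) = -mu/c - nu/w used in the argument."""
    return -mu / c - nu / w


if __name__ == "__main__":
    params = (1.0, 1.0, 0.1, 0.1)  # (alpha, beta, mu, nu)
    print("fixed point:", fixed_point(*params))
    print("late time  :", evolve(2.0, 0.5, 300.0, 0.01, params))
```

With these values the late-time populations should sit on top of the fixed point, in contrast to the closed orbits of the original Lotka-Volterra equations.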


The general version of the story above goes by the name of the Bendixson-Dulac 
theorem and is a powerful tool in the study of dynamical systems. 


4.3 Green's Theorem in the Plane 


Let $P(x,y)$ and $Q(x,y)$ be smooth functions on $\mathbb{R}^2$. Then

$$\int_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA = \oint_C P\,dx + Q\,dy \tag{4.10}$$

where $A$ is a bounded region in the plane and $C = \partial A$ is a piecewise smooth, non-
intersecting closed curve which is traversed anti-clockwise.

Proof: Green's theorem is equivalent to the 2d divergence theorem (4.4). Let
$\mathbf{F} = (Q, -P)$ be a vector field in $\mathbb{R}^2$. We then have

$$\int_A \nabla\cdot\mathbf{F}\; dA = \int_A \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dA \tag{4.11}$$

If $\mathbf{x}(s) = (x(s), y(s))$ is the parameterised curve $C$, then the tangent vector is
$\mathbf{t}(s) = (x'(s), y'(s))$ and the normal vector $\mathbf{n} = (y'(s), -x'(s))$ obeys $\mathbf{n}\cdot\mathbf{t} = 0$.


You'll need to do a little sketch to convince yourself that, as shown on the right,
$\mathbf{n}$ is the outward pointing normal provided that the arc length $s$ increases in the
anti-clockwise direction. We then have

$$\mathbf{F}\cdot\mathbf{n} = Q\frac{dy}{ds} + P\frac{dx}{ds}$$

and so the integral around $C$ is

$$\oint_C \mathbf{F}\cdot\mathbf{n}\; ds = \oint_C P\,dx + Q\,dy \tag{4.12}$$


The 2d divergence theorem is the statement that the left-hand sides of (4.11) and (4.12)
are equal; Green's theorem in the plane is the statement that the right-hand sides are
equal.


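Applied to a rectangular region, Green's theorem in the plane reduces to the
fundamental theorem of calculus. We take the rectangular region to be $0 \le x \le a$
and $0 \le y \le b$. Then

$$-\int_A \frac{\partial P}{\partial y}\, dA = -\int_0^a dx \int_0^b dy\, \frac{\partial P}{\partial y} = \int_0^a dx\, \big(-P(x,b) + P(x,0)\big) = \oint_C P\,dx$$

where only the horizontal segments contribute, and the minus signs are such that $C$ is
traversed anti-clockwise. Meanwhile, we also have

$$\int_A \frac{\partial Q}{\partial x}\, dA = \int_0^b dy \int_0^a dx\, \frac{\partial Q}{\partial x} = \int_0^b dy\, \big(Q(a,y) - Q(0,y)\big) = \oint_C Q\,dy$$

where, this time, only the vertical segments contribute.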

Figure 14. Don't mind the gap. Green's theorem for an area with disconnected boundaries.


Green's theorem also holds if the boundary of the area A has a number of disconnected components,
as shown in Figure 14. In this case, the integral should be done in an anti-clockwise 
direction around the exterior boundary, and in a clockwise direction on any interior 
boundary. The quickest way to see this is to do the integration around a continu- 
ous boundary, as shown in the right-hand figure, with an infinitesimal gap. The two 
contributions across the gap then cancel. 


An Example 


Let $P = x^2 y$ and $Q = x y^2$. We'll take $A$ to be the region bounded by the parabola
$y^2 = 4ax$ and the line $x = a$, both with $-2a \le y \le 2a$. Then Green's theorem in the
plane tells us that

$$\int_A (y^2 - x^2)\, dA = \oint_C x^2 y\, dx + x y^2\, dy$$

But this was a problem on the first examples sheet, where you found that both give
the answer $\frac{104}{105}a^4$.

4.3.1 George Green (1793-1841) 


George Green was born in Nottingham, England, the son of a miller. If you were born 
to a family of millers in the 18th century, they didn’t send you to a careers officer at
school to see what you want to be when you grow up. You'd be lucky just to be sent to 
school. Green got lucky. He attended school for an entire year before joining his father 
baking and running the mill. 


It is not known where Green learned his mathematics. The Nottingham subscription 
library held some volumes, but not enough to provide Green with the background 
that he clearly gained. Yet, from his mill, Green produced some of the most striking 
mathematics of his time, including the development of potential theory and, most 
importantly, the formalism of Green's functions that you will meet in Section 5, as
well as in later courses. Much of this was contained in a self-published pamphlet, from
1828, entitled "An Essay on the Application of Mathematical Analysis to the Theories
of Electricity and Magnetism". 51 copies were printed.


Green's reputation spread and, at the age of 40, with no formal education, and 
certainly no Latin or Greek, Green the miller came to Cambridge as a mathematics 
undergraduate, clothes covered in flour and pretending it was chalk. (University motto: 
nurturing imposter syndrome since 1209.) With hindsight, this may not have been the 
best move. Green did well in his exams, but his published papers did not reach the 
revolutionary heights of his work in the mill. He got a fellowship at Caius, developed 
a taste for port, then gout, and died before he reached his 50th birthday. 


There are parallels between Green's story and that of Ramanujan who came to 
Cambridge several decades later. To lose one self-taught genius might be regarded as 
a misfortune. To lose two begins to look like carelessness. 


4.4 Stokes' Theorem


Stokes' theorem is an extension of Green's theorem, but where the surface is no longer 
restricted to lie in a plane. 


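Let $S$ be a smooth surface in $\mathbb{R}^3$ with boundary $C = \partial S$ a piecewise smooth curve.
Stokes’ theorem states that, for any smooth vector field $\mathbf{F}(\mathbf{x})$, we have

$$\int_S \nabla\times\mathbf{F}\cdot d\mathbf{S} = \oint_C \mathbf{F}\cdot d\mathbf{x}$$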


The orientations of $S$ and $C$ should be compatible. The former is determined by the
choice of normal vector $\mathbf{n}$ to $S$; the latter by the choice of tangent vector $\mathbf{t}$ to $C$. The
two are said to be compatible if $\mathbf{t}\times\mathbf{n}$ points out of $S$. In practice, this means that if
you orient the open surface so that n points towards you, then the orientation of C is 
anti-clockwise. The general set-up is shown in Figure 15. 


Note that there will typically be many surfaces S that share the same boundary 
$C$. By Stokes’ theorem, the integral of $\nabla\times\mathbf{F}$ over $S$ must give the same answer for
all such surfaces. The theorem also holds if the boundary $\partial S$ consists of a number of
disconnected components, again with their orientation determined by that of S. 


We'll give a proof of Stokes’ theorem shortly. But first we put it to some use. 

Figure 15. The surface S and bounding curve C for Stokes’ theorem. The normal to the
surface is shown (at one point) by the red arrow. The theorem invites us to compute the flux 
of a vector field F, shown by the green arrows, through the surface, and compare it to the 
line integral around the boundary. 


The Meaning of the Curl 


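Stokes’ theorem gives us some new intuition for the curl of a vector field. If we integrate
$\nabla\times\mathbf{F}$ over a small enough surface such that $\nabla\times\mathbf{F}$ is approximately constant, then
we will have

$$\int_S \nabla\times\mathbf{F}\cdot d\mathbf{S} \approx A\,\mathbf{n}\cdot(\nabla\times\mathbf{F})$$

where $A$ is the area and $\mathbf{n}$ the normal of the surface. Taking the limit in which this
area shrinks to zero, Stokes’ theorem then tells us that

$$\mathbf{n}\cdot(\nabla\times\mathbf{F}) = \lim_{A\to 0}\frac{1}{A}\oint_C \mathbf{F}\cdot d\mathbf{x} \tag{4.13}$$

In other words, at any given point, the value of $\nabla\times\mathbf{F}$ in the direction $\mathbf{n}$ tells us about
the circulation of $\mathbf{F}$ in the plane normal to $\mathbf{n}$.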


A useful benchmark comes from considering the vector field $\mathbf{u} = \boldsymbol{\omega}\times\mathbf{x}$, which
describes a rigid rotation with angular velocity $\boldsymbol{\omega}$. (See, for example, the lectures on
Dynamics and Relativity.) In that case, we have $\nabla\times\mathbf{u} = 2\boldsymbol{\omega}$, so twice the angular
velocity.
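
The factor of two is easy to confirm with a finite-difference curl. This is a sketch only: the central-difference step `h`, the sample point and the sample angular velocity are arbitrary choices made for illustration.

```python
# Sketch: numerical curl of the rigid rotation u = omega x x.
# The step size h and the sample values below are assumptions.


def cross(a, b):
    """Cross product of two 3-vectors given as tuples or lists."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def curl(field, x, h=1e-5):
    """Central-difference estimate of the curl of `field` at the point x."""
    def d(i, j):
        # Partial derivative of component i with respect to coordinate j.
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        return (field(xp)[i] - field(xm)[i]) / (2 * h)

    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def rigid_rotation(omega):
    """Velocity field u(x) = omega x x."""
    return lambda x: cross(omega, x)


if __name__ == "__main__":
    omega = (0.3, -1.2, 2.0)
    print(curl(rigid_rotation(omega), (0.7, -0.4, 1.5)))  # close to 2 * omega
```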

Turning this on its head, we can get some intuition for Stokes’ theorem itself. The
curl of the vector field tells us about the local circulation of $\mathbf{F}$. When you integrate
this circulation over some surface $S$, most of it cancels out because the circulation
going one way is always cancelled by a neighbouring circulation going the other, as shown
in the figure. The only thing that's left when you integrate over the whole surface is
the circulation around the edge.


A Corollary: Irrotational Implies Conservative


Before we prove Stokes’ theorem, we can use it to tie off a thread that we previously
left hanging. Recall that in Section 3.2, we proved that $\mathbf{F} = \nabla\phi \;\Rightarrow\; \nabla\times\mathbf{F} = 0$,
but we didn't then have the tools to prove the converse. Now we do. It follows
straightforwardly from Stokes' theorem because an irrotational vector field, obeying
$\nabla\times\mathbf{F} = 0$, necessarily has

$$\oint_C \mathbf{F}\cdot d\mathbf{x} = 0$$

around any closed curve $C$. But we showed in Section 1.2 that any such conservative
field can be written as $\mathbf{F} = \nabla\phi$ for some potential $\phi$.


An Example 


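Let $S$ be the cap of a sphere of radius $R$ that is covered by the angle
$0 \le \theta \le \alpha$, as shown in the figure. We'll take

$$\mathbf{F} = (0, xz, 0) \quad\Longrightarrow\quad \nabla\times\mathbf{F} = (-x, 0, z) \tag{4.14}$$

This is the example that we discussed in Section 2.2.5, where we computed (see (2.11))

$$\int_S \nabla\times\mathbf{F}\cdot d\mathbf{S} = \pi R^3\cos\alpha\,\sin^2\alpha \tag{4.15}$$

That leaves us with the line integral around the rim. This curve $C$ is parameterised by
the angle $\phi$ and is given by

$$\mathbf{x}(\phi) = R(\sin\alpha\cos\phi, \sin\alpha\sin\phi, \cos\alpha) \quad\Longrightarrow\quad d\mathbf{x} = R(-\sin\alpha\sin\phi, \sin\alpha\cos\phi, 0)\, d\phi$$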

We then have

$$\oint_C \mathbf{F}\cdot d\mathbf{x} = \int_0^{2\pi} d\phi\; R\,xz\,\sin\alpha\cos\phi = R^3\sin^2\alpha\cos\alpha\int_0^{2\pi} d\phi\,\cos^2\phi = \pi R^3\sin^2\alpha\cos\alpha$$

in agreement with the surface integral (4.15).


Another Example 


As a second example, consider the conical surface $S$ defined by $z^2 = x^2 + y^2$ with
$0 < a \le z \le b$. This surface is parameterised, in cylindrical polar coordinates, by

$$\mathbf{x}(\rho, \phi) = (\rho\cos\phi, \rho\sin\phi, \rho) \tag{4.16}$$

with $a \le \rho \le b$ and $0 \le \phi < 2\pi$. We can compute two tangent vectors

$$\frac{\partial\mathbf{x}}{\partial\rho} = (\cos\phi, \sin\phi, 1) \quad\text{and}\quad \frac{\partial\mathbf{x}}{\partial\phi} = \rho(-\sin\phi, \cos\phi, 0)$$

and take their cross product to get the normal

$$\mathbf{n} = \frac{\partial\mathbf{x}}{\partial\rho}\times\frac{\partial\mathbf{x}}{\partial\phi} = (-\rho\cos\phi, -\rho\sin\phi, \rho)$$

This points inwards, as shown in the figure. The associated vector area element is

$$d\mathbf{S} = (-\cos\phi, -\sin\phi, 1)\,\rho\, d\rho\, d\phi$$

We'll integrate the same vector field (4.14) over this surface. We have

$$\nabla\times\mathbf{F}\cdot d\mathbf{S} = (x\cos\phi + z)\,\rho\, d\rho\, d\phi = \rho^2(\cos^2\phi + 1)\, d\rho\, d\phi$$

where we've substituted in the parametric expressions for $x$ and $z$ from (4.16). The
integral is then

$$\int_S \nabla\times\mathbf{F}\cdot d\mathbf{S} = \int_a^b d\rho\int_0^{2\pi} d\phi\; \rho^2(1 + \cos^2\phi) = \pi(b^3 - a^3) \tag{4.17}$$

Now the surface has two boundaries, and we must integrate over both of them. We
write $\partial S = C_b - C_a$ where $C_b$ has radius $b$ and $C_a$ radius $a$. Note the minus sign,
reflecting the fact that the orientation of the two circles is opposite.

For a circle of radius $R$, we have $\mathbf{x}(\phi) = R(\cos\phi, \sin\phi, 1)$, and so
$d\mathbf{x} = R(-\sin\phi, \cos\phi, 0)\, d\phi$ and

$$\oint_{C_R} \mathbf{F}\cdot d\mathbf{x} = \int_0^{2\pi} d\phi\; R^3\cos^2\phi = \pi R^3$$

Remembering that the orientation of $C_a$ is in the opposite direction, we reproduce the
surface integral (4.17).
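
The cone example can also be checked by brute force. The sketch below evaluates the flux of $\nabla\times\mathbf{F}$ through the cone with a midpoint rule, using the inward normal $(-\rho\cos\phi, -\rho\sin\phi, \rho)$, and compares it with the two rim integrals. The function names and grid sizes are assumptions made for illustration.

```python
import math

# Sketch: Stokes' theorem on the cone z^2 = x^2 + y^2 for F = (0, xz, 0).
# Function names and grid resolutions are illustrative assumptions.


def curl_F(x, y, z):
    """Curl of F = (0, xz, 0), as in (4.14)."""
    return (-x, 0.0, z)


def cone_flux(a, b, n_rho=200, n_phi=200):
    """Midpoint-rule flux of curl F through the cone a <= rho <= b."""
    d_rho = (b - a) / n_rho
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_rho):
        rho = a + (i + 0.5) * d_rho
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            cos_p, sin_p = math.cos(phi), math.sin(phi)
            curl = curl_F(rho * cos_p, rho * sin_p, rho)
            # Unnormalised normal dx/drho x dx/dphi from the text.
            normal = (-rho * cos_p, -rho * sin_p, rho)
            total += sum(c * n for c, n in zip(curl, normal)) * d_rho * d_phi
    return total


def rim_integral(radius, n_phi=200):
    """Anti-clockwise line integral of F around the circle at z = radius."""
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for j in range(n_phi):
        phi = (j + 0.5) * d_phi
        x, z = radius * math.cos(phi), radius
        dy = radius * math.cos(phi) * d_phi  # only F_y = xz is non-zero
        total += x * z * dy
    return total


if __name__ == "__main__":
    a, b = 1.0, 2.0
    print(cone_flux(a, b), rim_integral(b) - rim_integral(a))
```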
4.4.1 A Proof of Stokes’ Theorem 


It's clear that Stokes' theorem is a version of Green's theorem in the plane, but
viewed through 3d glasses. Indeed, it's trivial to show that the latter follows from
the former. Consider the vector field $\mathbf{F} = (P, Q, 0)$ in $\mathbb{R}^3$ and a surface $S$ that
lies flat in the $z = 0$ plane. The normal to this surface is $\mathbf{n} = \hat{\mathbf{z}}$, and we have

$$\int_S \nabla\times\mathbf{F}\cdot d\mathbf{S} = \int_S \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dS$$

But Stokes’ theorem then tells us that this can also be written as

$$\oint_C \mathbf{F}\cdot d\mathbf{x} = \oint_C P\,dx + Q\,dy$$


However, with a little more work we can also show that the converse is true. In other 
words, we can lift Green’s theorem out of the plane to find Stokes’ theorem. 


Consider a parameterised surface $S$ defined by $\mathbf{x}(u, v)$ and denote the associated area
in the $(u,v)$ plane as $A$. We parameterise the boundary $C = \partial S$ as $\mathbf{x}(u(t), v(t))$ and
the corresponding boundary $\partial A$ as $(u(t), v(t))$. The key idea is to use Green's theorem
in the $(u, v)$ plane for the area $A$ and then uplift this to prove Stokes' theorem for the
surface $S$.


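We start by looking at the integral around the boundary. It is

$$\oint_C \mathbf{F}\cdot d\mathbf{x} = \oint_C \mathbf{F}\cdot\left(\frac{\partial\mathbf{x}}{\partial u}\,du + \frac{\partial\mathbf{x}}{\partial v}\,dv\right) = \oint_{\partial A} F_u\,du + F_v\,dv$$

where $F_u = \mathbf{F}\cdot\partial\mathbf{x}/\partial u$ and $F_v = \mathbf{F}\cdot\partial\mathbf{x}/\partial v$. Now we're in a position to invoke Green's
theorem, in the form

$$\oint_{\partial A} F_u\,du + F_v\,dv = \int_A \left(\frac{\partial F_v}{\partial u} - \frac{\partial F_u}{\partial v}\right) dA$$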

Now our task is clear. We should look at the partial derivatives on the right hand side. 
We just need to be careful about what thing depends on what thing: 


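$$\frac{\partial F_v}{\partial u} = \frac{\partial}{\partial u}\left(\mathbf{F}\cdot\frac{\partial\mathbf{x}}{\partial v}\right) = \frac{\partial}{\partial u}\left(F_i\frac{\partial x^i}{\partial v}\right) = \left(\frac{\partial F_i}{\partial x^j}\frac{\partial x^j}{\partial u}\right)\frac{\partial x^i}{\partial v} + F_i\frac{\partial^2 x^i}{\partial u\,\partial v}$$

Meanwhile, we have

$$\frac{\partial F_u}{\partial v} = \frac{\partial}{\partial v}\left(\mathbf{F}\cdot\frac{\partial\mathbf{x}}{\partial u}\right) = \frac{\partial}{\partial v}\left(F_i\frac{\partial x^i}{\partial u}\right) = \left(\frac{\partial F_i}{\partial x^j}\frac{\partial x^j}{\partial v}\right)\frac{\partial x^i}{\partial u} + F_i\frac{\partial^2 x^i}{\partial v\,\partial u}$$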


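Subtracting the second expression from the first, the second derivative terms cancel,
leaving us with

$$\frac{\partial F_v}{\partial u} - \frac{\partial F_u}{\partial v} = \frac{\partial x^j}{\partial u}\frac{\partial x^i}{\partial v}\left(\frac{\partial F_i}{\partial x^j} - \frac{\partial F_j}{\partial x^i}\right) = (\delta_{jk}\delta_{il} - \delta_{jl}\delta_{ik})\frac{\partial x^k}{\partial u}\frac{\partial x^l}{\partial v}\frac{\partial F_i}{\partial x^j}$$

At this point we wield everyone’s favourite index notation identity

$$\epsilon_{jip}\,\epsilon_{pkl} = \delta_{jk}\delta_{il} - \delta_{jl}\delta_{ik}$$

We then have

$$\frac{\partial F_v}{\partial u} - \frac{\partial F_u}{\partial v} = \epsilon_{jip}\frac{\partial F_i}{\partial x^j}\;\epsilon_{pkl}\frac{\partial x^k}{\partial u}\frac{\partial x^l}{\partial v} = (\nabla\times\mathbf{F})\cdot\left(\frac{\partial\mathbf{x}}{\partial u}\times\frac{\partial\mathbf{x}}{\partial v}\right)$$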


Now we're done. Following through the chain of identities above, we have 


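$$\oint_C \mathbf{F}\cdot d\mathbf{x} = \int_A \left(\frac{\partial F_v}{\partial u} - \frac{\partial F_u}{\partial v}\right) du\, dv = \int_S \nabla\times\mathbf{F}\cdot d\mathbf{S}$$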


This is Stokes’ theorem. 
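
The one step in the proof that is taken on trust is the index identity $\epsilon_{jip}\epsilon_{pkl} = \delta_{jk}\delta_{il} - \delta_{jl}\delta_{ik}$. With only $3^4 = 81$ index choices it can be verified exhaustively. The sketch below does this with indices running over 0, 1, 2; the closed-form expression for the Levi-Civita symbol is a standard trick rather than anything used in the notes.

```python
# Sketch: brute-force check of eps_{jip} eps_{pkl} = d_{jk} d_{il} - d_{jl} d_{ik}.
# Indices run over 0, 1, 2; the helper names are illustrative.


def levi_civita(i, j, k):
    """Levi-Civita symbol: +1 / -1 for even / odd permutations, else 0."""
    return (i - j) * (j - k) * (k - i) // 2


def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0


def identity_holds():
    """Compare both sides of the identity for every choice of j, i, k, l."""
    idx = range(3)
    for j in idx:
        for i in idx:
            for k in idx:
                for l in idx:
                    lhs = sum(levi_civita(j, i, p) * levi_civita(p, k, l) for p in idx)
                    rhs = delta(j, k) * delta(i, l) - delta(j, l) * delta(i, k)
                    if lhs != rhs:
                        return False
    return True


if __name__ == "__main__":
    print(identity_holds())
```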


4.4.2 George Gabriel Stokes (1819-1903) 


Stokes was born in County Sligo, Ireland, but moved to Cambridge shortly after his 19th
birthday and remained there for the next 66 years, much of it as Lucasian professor. He
contributed widely to different areas of mathematics and physics, with the Navier-Stokes
equation, describing fluid flow, a particular highlight. 


8. If X, Y, Z be functions of the rectangular co-ordinates x, y, z, dS an element of any
limited surface, l, m, n the cosines of the inclinations of the normal at dS to the axes, ds
an element of the bounding line, shew that

$$\iint \left\{ l\left(\frac{dZ}{dy} - \frac{dY}{dz}\right) + m\left(\frac{dX}{dz} - \frac{dZ}{dx}\right) + n\left(\frac{dY}{dx} - \frac{dX}{dy}\right) \right\} dS = \int \left( X\frac{dx}{ds} + Y\frac{dy}{ds} + Z\frac{dz}{ds} \right) ds,$$

the differential coefficients of X, Y, Z being partial, and the single integral being taken all
round the perimeter of the surface.


Figure 16. You may now turn the page... the original version of Stokes’ theorem, set as an
exam question.


What we now call Stokes’ theorem was communicated 
to Stokes by his friend William Thomson, better known by 
his later name Lord Kelvin. The theorem first appeared 
in print in 1854 as part of the Smith's prize examination 
competition, a second set of exams aimed at those students 
who felt the Tripos wasn't brutal enough. 


If you're in Cambridge and looking for a tranquil place away from the tourists to sit,
drink coffee, and ponder the wider universe, then you could do worse than the Mill Road
cemetery, large parts of which are overgrown, derelict, and beautiful. Stokes is buried
there, as is Cayley, although both gravestones were destroyed long ago. You can find
Stokes' resting place nestled between the graves of his wife and daughter¹.


4.4.3 An Application: Magnetic Fields 


Consider an infinitely long wire carrying a current. What is the magnetic field that is 
produced? We can answer this by turning to the Maxwell equations (3.7). For time 
independent situations, like this, one of the equations reads 


$$\nabla\times\mathbf{B} = \mu_0\,\mathbf{J} \tag{4.18}$$


¹ A long, tree-lined avenue runs north off Mill Road. At the end, turn right to enter the cemetery.
There is a gravel path immediately off to your left, which you should ignore, but take the first mud 
track that runs parallel to it. Just after the gravestone bearing the name “Frederick Cooper” you will 
find the Stokes' family plot. 

where $\mathbf{J}$ is the current density and $\mu_0$ is a constant of nature that determines the
strength of the magnetic field and has some pretentious name that I can never remember.
Another of the Maxwell equations reads $\nabla\cdot\mathbf{B} = 0$ and in most situations we should
solve this in conjunction with (4.18) but here it will turn out, somewhat fortuitously,
that if we just find the obvious solution to (4.18) then it solves $\nabla\cdot\mathbf{B} = 0$ automatically.


The equation (4.18) provides a simple opportunity to use Stokes’ theorem. We inte- 
grate both sides over a surface S that cuts through the wire, as shown in the figure to 
the right. We then have 


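$$\int_S \nabla\times\mathbf{B}\cdot d\mathbf{S} = \oint_C \mathbf{B}\cdot d\mathbf{x} = \mu_0\int_S \mathbf{J}\cdot d\mathbf{S} = \mu_0 I$$


where the integral of the current density gives $I$,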
the total current through the wire. This equa- 
tion tells us that there must be a circulation of 
the magnetic field around the wire. In particular, 
there must be a component of B that lies tangent 
to any curve C that bounds a surface S. 


Let's suppose that the wire lies in the z-direction. (Rotate your head or your screen if you
don't like the z direction to be horizontal.) Then if $S$ is a disc of radius $\rho$, the
boundary $C = \partial S$ is parameterised by the curve

$$\mathbf{x} = \rho(\cos\phi, \sin\phi, 0) \quad\Longrightarrow\quad \mathbf{t} = \frac{\partial\mathbf{x}}{\partial\phi} = \rho(-\sin\phi, \cos\phi, 0)$$


We'll make the obvious guess that B lies in the same direction as t and work with the 
ansatz 


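$$\mathbf{B}(\mathbf{x}) = b(\rho)\,(-\sin\phi, \cos\phi, 0)$$


Then $\mathbf{B}\cdot\mathbf{t} = \rho\, b(\rho)$. Provided that $\rho$ is bigger than the radius of the wire, Maxwell's
equation tells us that

$$\mu_0 I = \oint_C \mathbf{B}\cdot d\mathbf{x} = \int_0^{2\pi} d\phi\; \rho\, b(\rho) \quad\Longrightarrow\quad \mathbf{B}(\mathbf{x}) = \frac{\mu_0 I}{2\pi\rho}\,(-\sin\phi, \cos\phi, 0)$$

You can check that this answer also satisfies the other Maxwell equation $\nabla\cdot\mathbf{B} = 0$.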


We learn that the magnetic field circulates around the wire, and drops off as 1/p with 
p the distance from the wire. 
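
Stokes' theorem says that the circulation should equal $\mu_0 I$ for any curve that encloses the wire once, not just for a circle. The sketch below checks this for a square loop, writing the field in Cartesian form $\mathbf{B} = \mu_0 I\,(-y, x, 0)/2\pi(x^2 + y^2)$. Setting $\mu_0 = 1$, the loop size and the number of sample points are all assumptions made for illustration.

```python
import math

# Sketch: circulation of the wire's magnetic field around a square loop.
# Units with mu0 = 1 and the sampling resolution are assumptions.


def b_field(x, y, current, mu0=1.0):
    """In-plane components of B = mu0 I / (2 pi rho) (-sin phi, cos phi)."""
    rho_sq = x * x + y * y
    prefactor = mu0 * current / (2.0 * math.pi * rho_sq)
    return (-prefactor * y, prefactor * x)


def circulation_square(half_side, current, mu0=1.0, n=1000):
    """Midpoint-rule line integral of B around an anti-clockwise square."""
    h = half_side
    corners = [(h, -h), (h, h), (-h, h), (-h, -h), (h, -h)]
    total = 0.0
    for (x0, y0), (x1, y1) in zip(corners, corners[1:]):
        dx, dy = (x1 - x0) / n, (y1 - y0) / n
        for k in range(n):
            xm, ym = x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy
            bx, by = b_field(xm, ym, current, mu0)
            total += bx * dx + by * dy
    return total


if __name__ == "__main__":
    print(circulation_square(1.0, current=2.5))  # close to mu0 * I = 2.5
```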

4.4.4 Changing Coordinates Revisited


Back in Section 3.3, we wrote down the expressions for the divergence and curl in a 
general orthonormal curvilinear coordinate system. Now we can offer a proof using the 
integral theorems above. 


Claim: The divergence of a vector field $\mathbf{F}(u, v, w)$ in a general orthogonal, curvilinear
coordinate system is given by

$$\nabla\cdot\mathbf{F} = \frac{1}{h_u h_v h_w}\left(\frac{\partial}{\partial u}(h_v h_w F_u) + \frac{\partial}{\partial v}(h_u h_w F_v) + \frac{\partial}{\partial w}(h_u h_v F_w)\right) \tag{4.19}$$

Proof: We sketch a proof that works with the integral definition of the divergence
(4.2),

$$\nabla\cdot\mathbf{F} = \lim_{V\to 0}\frac{1}{V}\int_S \mathbf{F}\cdot d\mathbf{S}$$

We can take the volume $V$ to consist of a small cuboid at point $(u, v, w)$ with sides
parallel to the basis vectors $\mathbf{e}_u$, $\mathbf{e}_v$ and $\mathbf{e}_w$. The volume of the cube
is $h_u h_v h_w\,\delta u\,\delta v\,\delta w$. Meanwhile, the area of, say, the
upper face in the figure is roughly $h_u h_v\,\delta u\,\delta v$. Since $h_u$ and $h_v$ may depend on the
coordinates, this could differ from the area of the lower face, albeit only by a small
amount $\delta w$. Then, assuming that $\mathbf{F}$ is roughly constant on each face, we have

$$
\begin{aligned}
\int_S \mathbf{F}\cdot d\mathbf{S} &\approx \big[h_u h_v F_w(u, v, w + \delta w) - h_u h_v F_w(u, v, w)\big]\,\delta u\,\delta v + \text{two more terms} \\
&\approx \frac{\partial}{\partial w}(h_u h_v F_w)\,\delta u\,\delta v\,\delta w + \text{two more terms}
\end{aligned}
$$

Dividing through by the volume then gives us the advertised expression for $\nabla\cdot\mathbf{F}$.
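
The formula (4.19) is easy to test numerically. The sketch below evaluates it with central differences for arbitrary scale factors and tries it on two fields whose divergence we know: the position vector, with $\nabla\cdot\mathbf{x} = 3$, and the constant field $\hat{\mathbf{x}}$ written in spherical polar components, with vanishing divergence. The scale factors $(1, r, r\sin\theta)$ and $(1, \rho, 1)$ are the standard ones for spherical and cylindrical polars; the function names and step size are assumptions.

```python
import math

# Sketch: finite-difference evaluation of the curvilinear divergence (4.19).
# Function names, step size and sample points are illustrative assumptions.


def divergence(components, scale_factors, point, h=1e-5):
    """Evaluate (4.19) at `point` = (u, v, w) using central differences.

    components(u, v, w)    -> (F_u, F_v, F_w)
    scale_factors(u, v, w) -> (h_u, h_v, h_w)
    """
    def flux_density(axis, q):
        # Product of the two *other* scale factors with the component F_axis.
        hs = scale_factors(*q)
        others = [hs[k] for k in range(3) if k != axis]
        return others[0] * others[1] * components(*q)[axis]

    total = 0.0
    for axis in range(3):
        qp, qm = list(point), list(point)
        qp[axis] += h
        qm[axis] -= h
        total += (flux_density(axis, qp) - flux_density(axis, qm)) / (2 * h)
    h_u, h_v, h_w = scale_factors(*point)
    return total / (h_u * h_v * h_w)


def spherical_scale(r, theta, phi):
    """Scale factors (h_r, h_theta, h_phi) for spherical polars."""
    return (1.0, r, r * math.sin(theta))


def cylindrical_scale(rho, phi, z):
    """Scale factors (h_rho, h_phi, h_z) for cylindrical polars."""
    return (1.0, rho, 1.0)


def x_hat_spherical(r, theta, phi):
    """The constant Cartesian field (1, 0, 0) in spherical components."""
    return (
        math.sin(theta) * math.cos(phi),
        math.cos(theta) * math.cos(phi),
        -math.sin(phi),
    )


if __name__ == "__main__":
    p = (1.3, 0.8, 0.5)
    print(divergence(lambda r, t, f: (r, 0.0, 0.0), spherical_scale, p))
    print(divergence(x_hat_spherical, spherical_scale, p))
```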


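Claim: The curl of a vector field $\mathbf{F}(u, v, w)$ in a general orthogonal, curvilinear
coordinate system is given by

$$
\nabla\times\mathbf{F} = \frac{1}{h_u h_v h_w}
\begin{vmatrix}
h_u\mathbf{e}_u & h_v\mathbf{e}_v & h_w\mathbf{e}_w \\
\dfrac{\partial}{\partial u} & \dfrac{\partial}{\partial v} & \dfrac{\partial}{\partial w} \\
h_u F_u & h_v F_v & h_w F_w
\end{vmatrix}
= \frac{1}{h_v h_w}\left(\frac{\partial}{\partial v}(h_w F_w) - \frac{\partial}{\partial w}(h_v F_v)\right)\mathbf{e}_u + \text{two similar terms}
$$

Proof: This time we use the integral definition of curl (4.13)

$$\mathbf{n}\cdot(\nabla\times\mathbf{F}) = \lim_{A\to 0}\frac{1}{A}\oint_C \mathbf{F}\cdot d\mathbf{x}$$

We'll take a surface $S$ with normal $\mathbf{n} = \mathbf{e}_w$, and integrate over a small region,
bounded by one of the squares in the figure on the right. The area of the square is
$h_u h_v\,\delta u\,\delta v$ while the lengths of the sides are $h_u\,\delta u$ and $h_v\,\delta v$. Assuming that
the square is small enough so that $\mathbf{F}$ is roughly constant along any given side, we have

$$
\begin{aligned}
\oint_C \mathbf{F}\cdot d\mathbf{x} &\approx h_u F_u(u, v)\,\delta u + h_v F_v(u + \delta u, v)\,\delta v - h_u F_u(u, v + \delta v)\,\delta u - h_v F_v(u, v)\,\delta v \\
&\approx \left(\frac{\partial}{\partial u}(h_v F_v) - \frac{\partial}{\partial v}(h_u F_u)\right)\delta u\,\delta v
\end{aligned}
$$

Dividing by the area, this gives

$$\mathbf{e}_w\cdot\nabla\times\mathbf{F} = \frac{1}{h_u h_v}\left(\frac{\partial}{\partial u}(h_v F_v) - \frac{\partial}{\partial v}(h_u F_u)\right)$$

which is one of the three promised terms in the expression for $\nabla\times\mathbf{F}$.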

5 The Poisson and Laplace Equations


Until now, our focus has been very much on understanding how to differentiate and 
integrate functions of various types. But, with this under our belts, we can now take 
the next step and explore various differential equations that are written in the language 
of vector calculus. Our goal in this section is to find solutions to the Poisson equation 
and the related Laplace equation. This we will do in Section 5.2. But first we will 
explain why these equations underlie two of the most important forces in the universe.


5.1 Gravity and Electrostatics 


The first two fundamental forces to be discovered are also the simplest to describe 
mathematically. Newton's law of gravity states that two masses, m and M, separated 
by a distance r will experience a force 
$$\mathbf{F}(\mathbf{r}) = -\frac{GMm}{r^2}\,\hat{\mathbf{r}} \tag{5.1}$$

with $G$ Newton's constant, a fundamental constant of nature that determines the
strength of the gravitational force. Meanwhile, Coulomb's law states that two electric
charges, $q$ and $Q$, separated by a distance $r$ will experience a force

$$\mathbf{F}(\mathbf{r}) = \frac{Qq}{4\pi\epsilon_0 r^2}\,\hat{\mathbf{r}} \tag{5.2}$$

with the electric constant $\epsilon_0$ a fundamental constant of nature that determines the
inverse strength of the electrostatic force. The extra factor $4\pi$ reflects the fact that in
the century between Newton and Coulomb people had figured out where factors of
$4\pi$ should sit in equations.


Most likely it will not have escaped your attention that these two equations are
essentially the same. The only real difference is that overall minus sign which tells 
us that two masses always attract while two like charges repel. The question that we 
would like to ask is: why are the forces so similar? 


Certainly it's not true that there is a deep connection between gravity and the elec- 
trostatic force, at least not one that we've uncovered to date. In particular, when 
masses and charges start to move, both the forces described above are replaced by 
something different and more complicated — general relativity in the case of gravity, 
the full Maxwell equations (3.7) in the case of the Coulomb force — and the equations 
of these theories are very different from each other. Yet, when we restrict to the simple, 
static set-up, the forces take the same form. 

The reason for this is twofold. First, both forces are described by fields. Second,
space has three dimensions. The purpose of this section is to explain this in more 
detail. And, for this, we need the tools of vector calculus. 


5.1.1 Gauss' Law


Each of the force equations (5.1) and (5.2) contains some property that characterises 
the force: mass for gravity and electric charge for the electrostatic force. For our 
purposes, it will be useful to focus on one of the particles that carries mass m and 
charge q. We call this a test particle, meaning that we'll look at how this particle is 
buffeted by various forces but won't, in turn, consider its effect on any other particle. 
Physically, this is appropriate if $m \ll M$ and $q \ll Q$. Then it is useful to write the
equation in a way that separates the properties of the test particle from the other. The 
force experienced by the test particle is 


F(x) = mg(x) + qE(x) 


where g(x) is the gravitational and E(x) is the electric field. Clearly Newton’s law is 
telling us that a particle of mass M sets up a gravitational field 


$$\mathbf{g}(\mathbf{x}) = -\frac{GM}{r^2}\,\hat{\mathbf{r}} \tag{5.3}$$

while a particle with electric charge $Q$ sets up an electric field

$$\mathbf{E}(\mathbf{x}) = \frac{Q}{4\pi\epsilon_0 r^2}\,\hat{\mathbf{r}} \tag{5.4}$$

So far this is just a trivial rewriting of the force laws. However, we will now reframe
these force laws in the language of vector calculus. Instead of postulating the $1/r^2$ force
laws (5.3) and (5.4), we will replace them by two properties of the fields from which 
everything else follows. Here we specify the first property; the second will be explained 
in Section 5.1.2. 


The first property is that if you integrate the relevant field over a closed surface, then 
it captures the amount of “stuff” inside this surface. For the gravitational field, this 
stuff is mass 


$$\int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi GM \tag{5.5}$$

while for the electric field it is charge

$$\int_S \mathbf{E}\cdot d\mathbf{S} = \frac{Q}{\epsilon_0} \tag{5.6}$$

Again, the difference in minus sign signals the important attractive/repulsive difference
between the two forces. In contrast, the factors of $4\pi G$ and $1/\epsilon_0$ are simply convention
for how we characterise the strength of the fields. These two equations are known as 
Gauss’ law. Or, more precisely, “Gauss’ law in integrated form”. We’ll see the other 
form below. 


Examples 


For concreteness, let’s focus on the gravitational field. We will take a sphere of radius 
R and total mass M. We will require that the density of the sphere is spherically 
symmetric, but not necessarily constant. The spherical symmetry of the problem then 
ensures that the gravitational field itself is spherically symmetric, with $\mathbf{g}(\mathbf{x}) = g(r)\,\hat{\mathbf{r}}$.
If we then integrate the gravitational field over any spherical surface $S$ of radius $r > R$,
we have

$$\int_S \mathbf{g}\cdot d\mathbf{S} = \int_S g(r)\, dS = 4\pi r^2 g(r)$$

where we recognise $4\pi r^2$ as the area of the sphere. From Gauss’ law (5.5) we then have

$$\mathbf{g}(\mathbf{x}) = -\frac{GM}{r^2}\,\hat{\mathbf{r}} \tag{5.7}$$

This reproduces Newton’s force law (5.1). Note, however, that we’ve extended Newton’s
law beyond the original remit of point particles: the gravitational field (5.7) holds for
any spherically symmetric distribution of mass, provided that we’re outside this mass.
For example, it tells us that the gravitational field of the Earth (at least assuming 
spherical symmetry) is indistinguishable from the gravitational field of a point-like par- 
ticle with the same mass, sitting at the origin. This way of solving for the vector field 
is known as the Gauss flux method. 


Another rather cute consequence of this is that, at least for spherically symmetric 
mass distributions, you don’t feel the mass outside you. According to Gauss’ law, the 
gravitational field at any point is determined only by what lies inside a sphere of a 
given radius. So if, for example, you were able to hollow out the centre of a planet 
(unlikely, admittedly) then anyone living there would feel no gravitational force from 
the mass that surrounds them. 
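
Gauss' law (5.5) makes no reference to the shape of the surface, only to the mass inside. As a rough numerical illustration, the sketch below computes the outward flux of the point-mass field (5.3) through a cube, first with the mass inside and then with the mass outside. Units with $GM = 1$, the cube positions and the grid resolution are all assumptions made for the example.

```python
import math

# Sketch: flux of g = -GM r_hat / r^2 through the surface of a cube.
# Units with GM = 1 and the midpoint-rule resolution are assumptions.


def gravity(x, y, z, gm=1.0):
    """Gravitational field (5.3) of a point mass sitting at the origin."""
    r_cubed = (x * x + y * y + z * z) ** 1.5
    return (-gm * x / r_cubed, -gm * y / r_cubed, -gm * z / r_cubed)


def flux_through_cube(centre, half_side, gm=1.0, n=100):
    """Outward flux of g through a cube, using a midpoint rule on each face."""
    step = 2.0 * half_side / n
    total = 0.0
    for axis in range(3):
        for sign in (-1.0, 1.0):
            for i in range(n):
                u = -half_side + (i + 0.5) * step
                for j in range(n):
                    v = -half_side + (j + 0.5) * step
                    # Offset from the cube centre: the face sits at
                    # sign * half_side along `axis`, (u, v) span the face.
                    offset = [u, v]
                    offset.insert(axis, sign * half_side)
                    point = [centre[k] + offset[k] for k in range(3)]
                    # The outward normal is sign * e_axis.
                    total += sign * gravity(*point, gm)[axis] * step * step
    return total


if __name__ == "__main__":
    print(flux_through_cube((0.2, -0.1, 0.3), 1.0))  # mass inside: about -4 pi
    print(flux_through_cube((3.0, 0.0, 0.0), 1.0))   # mass outside: about 0
```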

For our second example, we turn to the electric field. Consider an infinite line of
charge, with charge per unit length $\sigma$. This situation is crying out for cylindrical polar
coordinates. Until now, we've always called the radial direction in cylindrical polar
coordinates $\rho$ but, for reasons that will become clear shortly, for this example alone
we will instead call the radial direction $r$ as shown in the figure. The symmetry of
the problem shows that the electric field is radial so takes the form $\mathbf{E}(\mathbf{r}) = E(r)\,\hat{\mathbf{r}}$.
Integrating over a cylinder $S$ of radius $r$ and length $L$ we have

$$\int_S \mathbf{E}\cdot d\mathbf{S} = 2\pi r L\, E(r)$$

where there is no contribution from the end caps because $\mathbf{n}\cdot\mathbf{E} = 0$ there, with $\mathbf{n}$
the normal vector. The total charge inside this surface is $Q = \sigma L$. From Gauss’ law (5.6),
we then have the electric field

$$\mathbf{E}(\mathbf{r}) = \frac{\sigma}{2\pi\epsilon_0 r}\,\hat{\mathbf{r}}$$

Note that the $1/r$ behaviour arises because the symmetry of the problem ensures that the
electric field lies in a plane. Said differently, the electric field from an infinite charged
line is the same as we would get from a point particle in a flatland world of two dimensions.


More generally, if space were $\mathbb{R}^n$, then the Gauss' law equations (5.5) and (5.6) would
still be the correct description of the gravitational and electric fields. Repeating the
calculations above would then tell us that a point charge gives rise to an electric field

$$\mathbf{E}(\mathbf{r}) = \frac{Q}{A_{n-1}\,\epsilon_0\, r^{n-1}}\,\hat{\mathbf{r}}$$

where $A_{n-1}r^{n-1}$ is the “surface area” of an $(n-1)$-dimensional sphere $S^{n-1}$ of radius $r$. (For what
it’s worth, the prefactor is $A_{n-1} = 2\pi^{n/2}/\Gamma(n/2)$ where $\Gamma(x)$ is the gamma function
which coincides with the factorial function $\Gamma(x) = (x - 1)!$ when $x$ is integer.) For the
rest of this section, we'll keep our feet firmly in $\mathbb{R}^3$.
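
As a quick sanity check on the prefactor, the sketch below evaluates $A_{n-1} = 2\pi^{n/2}/\Gamma(n/2)$ with the standard library's gamma function. It should return the circumference $2\pi$ of the unit circle for $n = 2$ and the area $4\pi$ of the unit sphere for $n = 3$. The function names, and the choice of units with $\epsilon_0 = 1$, are assumptions made for illustration.

```python
import math

# Sketch: the "surface area" prefactor A_{n-1} and the resulting point-charge
# field in n spatial dimensions. Names and eps0 = 1 units are assumptions.


def unit_sphere_area(n):
    """A_{n-1} = 2 pi^(n/2) / Gamma(n/2), the area of the unit sphere in R^n."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def point_charge_field(charge, r, n, eps0=1.0):
    """Magnitude of the radial field Q / (A_{n-1} eps0 r^(n-1))."""
    return charge / (unit_sphere_area(n) * eps0 * r ** (n - 1))


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, unit_sphere_area(n))
```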


Gauss' Law Again


There's a useful way to rewrite the Gauss' law equations (5.5) and (5.6). For the 
gravitational field, we introduce the density, or mass per unit volume, p(x). Invoking 
the divergence theorem then, for any volume V bounded by S, we have 


$$\int_V \nabla\cdot\mathbf{g}\; dV = \int_S \mathbf{g}\cdot d\mathbf{S} = -4\pi GM = -4\pi G\int_V \rho\; dV$$

But, rearranging, we have

$$\int_V \big(\nabla\cdot\mathbf{g} + 4\pi G\rho(\mathbf{x})\big)\, dV = 0$$

for any volume $V$. This can only hold if the integrand itself vanishes, so we must have

$$\nabla\cdot\mathbf{g} = -4\pi G\rho(\mathbf{x}) \tag{5.8}$$


This is also known as Gauss’ law for the gravitational field, now in differential form. The 
equivalence with the earlier integrated form (5.5) follows, as above, from the divergence 
theorem. 


We can apply the same manipulations to the electric field. This time we introduce 
the charge density $\rho_e(\mathbf{x})$. We then get Gauss’ law in the form

$$\nabla\cdot\mathbf{E} = \frac{\rho_e}{\epsilon_0} \tag{5.9}$$

This is the first of the Maxwell equations (3.7). (In our earlier expression, we denoted
the charge density as $\rho(\mathbf{x})$. Here we've added the subscript $\rho_e$ to distinguish it from
mass density.) The manipulations that we've described above show that Gauss' law is 
a grown-up version of the Coulomb force law (5.2). 


5.1.2 Potentials 


In our examples above, we used symmetry arguments to figure out the direction in 
which the gravitational and electric fields are pointing. But in many situations we 
don’t have that luxury. In that case, we need to invoke the second important property 
of these vector fields: they are both conservative. 


Recall that, by now, we have a number of different ways to talk about conservative 
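vector fields. Such fields are necessarily irrotational $\nabla\times\mathbf{g} = \nabla\times\mathbf{E} = 0$. Furthermore,
their integral vanishes when integrated around any closed curve $C$,

$$\oint_C \mathbf{g}\cdot d\mathbf{x} = \oint_C \mathbf{E}\cdot d\mathbf{x} = 0$$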
C C 


You can check that both of these hold for the examples, such as the $1/r^2$ field, that we
discussed above (as long as the path C avoids the singular point at the origin). 

Here the key property of a conservative vector field is that it can be written in terms
of an underlying scalar field,

$$\mathbf{g} = -\nabla\Phi \quad\text{and}\quad \mathbf{E} = -\nabla\phi \tag{5.10}$$

where $\Phi(\mathbf{x})$ is the gravitational potential and $\phi(\mathbf{x})$ the electrostatic potential. Note
the additional minus signs in these definitions. We saw in the discussion around (1.18)
that the existence of such potentials ensures that test particles experiencing these forces
have a conserved energy:

$$\text{energy} = \frac{1}{2}m\dot{\mathbf{x}}^2 + m\Phi(\mathbf{x}) + q\phi(\mathbf{x})$$


Combining the differential form of the Gauss’ law (5.8) and (5.9) with the existence of 
the potentials (5.10), we find that the gravitational and electric fields are determined, 
in general, by solutions to the following equations 

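$$\nabla^2\Phi = 4\pi G\rho(\mathbf{x}) \quad\text{and}\quad \nabla^2\phi = -\frac{\rho_e(\mathbf{x})}{\epsilon_0}$$

Equations of this type are known as the Poisson equation. In the special case where
the “source” $\rho(\mathbf{x})$ on the right-hand side vanishes, this reduces to the Laplace equation,
for example

$$\nabla^2\phi = 0$$
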
These two equations are commonplace in mathematics and physics. Here we have 


derived them in the context of gravity and electrostatics, but their applications spread 
much further. 


To give just one further example, in Fluid Mechanics the motion of the fluid is 
described by a velocity field u(x). If the flow is irrotational, then V x u = 0 and the 
velocity can be described by a potential function $\mathbf{u} = \nabla\phi$. If, in addition, the fluid
is incompressible then $\nabla\cdot\mathbf{u} = 0$ and we once again find ourselves solving the Laplace
equation $\nabla^2\phi = 0$.


5.2 The Poisson and Laplace Equations 


In the rest of this section we will develop some methods to solve the Poisson equation. 
We change notation and call the potential $\psi(\mathbf{x})$ (to avoid confusion with the polar angle
$\phi$). We are then looking for solutions to

$$\nabla^2\psi(\mathbf{x}) = -\rho(\mathbf{x})$$

The goal is to solve for $\psi(\mathbf{x})$ given a “source” $\rho(\mathbf{x})$. As we will see, the domain in which
$\psi(\mathbf{x})$ lives, together with associated boundary conditions, also plays an important role
in determining $\psi(\mathbf{x})$.

The Laplace equation $\nabla^2\psi = 0$ is linear. This means that if $\psi_1(\mathbf{x})$ is a solution
and $\psi_2(\mathbf{x})$ is a solution, then so too is $\psi_1(\mathbf{x}) + \psi_2(\mathbf{x})$. Any solution to the Laplace
equation acts as a complementary solution to the Poisson equation. This should then
be accompanied by a particular solution for a given source p(x) on the right-hand side. 


5.2.1 Isotropic Solutions 

Both the Laplace and Poisson equations are partial differential equations. Life is
generally much easier if we're asked to solve ordinary differential equations rather than
partial differential equations. For the Poisson equation, this is what we get if we have
some kind of symmetry, typically one aligned to some polar coordinates.


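For example, if we have spherical symmetry then we can look for solutions of the
form $\psi(\mathbf{x}) = \psi(r)$. Using the form of the Laplacian (3.15), the Laplace equation becomes

$$\nabla^2\psi = \frac{d^2\psi}{dr^2} + \frac{2}{r}\frac{d\psi}{dr} = \frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\psi}{dr}\right) = 0 \quad\Rightarrow\quad \psi(r) = \frac{A}{r} + B \qquad (5.11)$$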


for some constants $A$ and $B$. Clearly the $A/r$ solution diverges as $r \to 0$ so we should
be cautious in claiming that this solves the Laplace equation at $r = 0$. (We will shortly
see that it doesn't, but it does solve a related Poisson equation.) Note that the solution
$A/r$ is relevant in gravity or in electrostatics, where $\psi(r)$ has the interpretation as the
potential for a point charge.


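Meanwhile, in cylindrical polar coordinates we will also denote the radial direction as
$r$ to avoid confusion with the source $\rho$ in the Poisson equation. The Laplace equation
becomes

$$\nabla^2\psi = \frac{d^2\psi}{dr^2} + \frac{1}{r}\frac{d\psi}{dr} = \frac{1}{r}\frac{d}{dr}\left(r\frac{d\psi}{dr}\right) = 0 \quad\Rightarrow\quad \psi(r) = A\log r + B \qquad (5.12)$$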


This again diverges at r = 0, this time corresponding to the entire z axis. 


Note that if we ignore the $z$ direction, as we have above, then cylindrical polar
coordinates are the same thing as 2d polar coordinates, and the log form is the rotationally
invariant solution to the Laplace equation in $\mathbb{R}^2$. In general, in $\mathbb{R}^n$, the non-constant
solution to the Laplace equation is $1/r^{n-2}$. The low dimensions of $\mathbb{R}^2$ and $\mathbb{R}$ are special
because the solution grows asymptotically as $r \to \infty$, while for $\mathbb{R}^n$ with $n \geq 3$, the
rotationally invariant solution to the Laplace equation decays to a constant asymptotically.

If $\psi(\mathbf{x})$ is a solution to the Laplace equation, then so too is any derivative of $\psi(\mathbf{x})$.
For example, if we take the spherically symmetric solution $\psi(r) = 1/r$, then we can
construct a new solution

$$\psi_{\rm dipole}(\mathbf{x}) = \mathbf{d}\cdot\nabla\left(\frac{1}{r}\right) = -\frac{\mathbf{d}\cdot\mathbf{x}}{r^3}$$

for any constant vector $\mathbf{d}$ and, again, with $r \neq 0$. This kind of solution is important in
electrostatics where it arises as the large distance solution for a dipole, two equal and
opposite charges at a fixed distance apart.
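The rotationally invariant solutions quoted above, $\log r$ in two dimensions and $1/r^{n-2}$ in $n$ dimensions, can be checked numerically. The sketch below is only an illustration: it assumes the standard radial form of the Laplacian in $\mathbb{R}^n$, $f'' + (n-1)f'/r$, and approximates it with central finite differences. The function names and the sample radius are hypothetical choices, not part of the notes.

```python
import math


def radial_laplacian(f, r, n, h=1e-4):
    """Finite-difference estimate of f''(r) + (n - 1)/r * f'(r)."""
    d1 = (f(r + h) - f(r - h)) / (2 * h)
    d2 = (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)
    return d2 + (n - 1) / r * d1


def isotropic_solution(n):
    """Non-constant rotationally invariant solution in R^n, for n >= 2."""
    if n == 2:
        return math.log
    return lambda r: r ** (2 - n)


# Residuals of the Laplace equation at an arbitrary radius away from r = 0.
residuals = {n: radial_laplacian(isotropic_solution(n), 1.7, n) for n in (2, 3, 4, 5)}

# For contrast, 1/r is not harmonic in R^4: its radial Laplacian is -1/r^3.
not_harmonic = radial_laplacian(lambda r: 1 / r, 1.7, 4)

for n, res in residuals.items():
    print(f"n = {n}: residual = {res:.2e}")
print(f"1/r in R^4: {not_harmonic:.4f}")
```

The residuals are zero only up to finite-difference and rounding error, so they should be read as small rather than exactly vanishing.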


Discontinuities and Boundary Conditions 


In many situations, we must specify some further data when solving the Poisson equa- 
tions. Typically this is some kind of boundary condition and, in some circumstances, 
a requirement of continuity and smoothness on the solution. 


This can be illustrated with a simple example. Suppose that we are looking for a
spherically symmetric solution to:

$$\nabla^2\psi = \begin{cases} -\rho_0 & r \leq R \\ 0 & r > R \end{cases}$$

with $\rho_0$ constant. We will further ask that $\psi(r = 0)$ is non-singular, that $\psi(r) \to 0$
as $r \to \infty$, and that $\psi(r)$ and $\psi'(r)$ are continuous. We will now see that all of these
conditions give us a unique solution.


First look inside $r \leq R$. As we mentioned above, a solution to the Poisson equation
can be found by adding a complementary solution and a particular solution. Since
we're looking for a spherically symmetric particular solution, we can restrict our ansatz
to $\psi(r) = r^p$ for some $p$. It’s simple to check that $\nabla^2 r^p = p(p+1)r^{p-2}$. This then gives
us the general solution

$$\psi(r) = \frac{A}{r} + B - \frac{1}{6}\rho_0 r^2 \qquad r \leq R$$
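For completeness, here is the short calculation behind the power-law check, written with the radial form of the Laplacian used in (5.11). Setting $p = 2$ fixes the coefficient of the particular solution.

```latex
\nabla^2 r^p = \frac{1}{r^2}\frac{d}{dr}\left(r^2 \, p\, r^{p-1}\right)
             = \frac{p(p+1)\, r^{p}}{r^2}
             = p(p+1)\, r^{p-2}
\quad\Longrightarrow\quad
\nabla^2\left(-\tfrac{1}{6}\rho_0 r^2\right) = -\rho_0
```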


But now we can start killing some terms by invoking the boundary conditions. In
particular, the requirement that $\psi(r)$ is non-singular at $r = 0$ tells us that we must
have $A = 0$. Meanwhile, outside $r > R$ the most general solution is

$$\psi(r) = \frac{C}{r} + D$$


Figure 17. The plot of $\Phi = -4\pi G\psi$ on the left, with the radius $R = 1$ the cross-over point.
This is more apparent in the gravitational field $g = -\Phi'$ shown on the right.


Now we must have $D = 0$ if $\psi(r) \to 0$ as $r \to \infty$. To finish, we must patch these two
solutions at $r = R$, invoking continuity

$$\psi(r = R) = B - \frac{1}{6}\rho_0 R^2 = \frac{C}{R}$$

and smoothness

$$\psi'(r = R) = -\frac{1}{3}\rho_0 R = -\frac{C}{R^2}$$


These determine our last two unknown constants, $B = \frac{1}{2}\rho_0 R^2$ and $C = \frac{1}{3}\rho_0 R^3$. Putting this
together, we have a unique solution

$$\psi(r) = \begin{cases} \frac{1}{6}\rho_0\left(3R^2 - r^2\right) & r \leq R \\ \frac{1}{3}\rho_0 R^3/r & r > R \end{cases}$$

This example has application for the gravitational potential $\Phi = -4\pi G\psi$ of a planet
of radius $R$ and density $\rho_0$. The plot of $\Phi$ is shown on the left of Figure 17; the plot of
the gravitational field $g = -d\Phi/dr$ is on the right, where we see a linear increase inside
the planet, before we get to the more familiar $1/r^2$ fall-off.
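The patched solution can be sanity-checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original notes: it takes $\rho_0 = R = 1$, uses finite differences for the radial Laplacian $\psi'' + 2\psi'/r$, and compares one-sided slopes at $r = R$. All function and variable names are hypothetical.

```python
def psi(r, rho0=1.0, R=1.0):
    """Potential for a uniform ball of density rho0 and radius R."""
    if r <= R:
        return rho0 * (3 * R**2 - r**2) / 6
    return rho0 * R**3 / (3 * r)


def radial_laplacian(f, r, h=1e-4):
    """Finite-difference estimate of f'' + (2/r) f' in three dimensions."""
    d1 = (f(r + h) - f(r - h)) / (2 * h)
    d2 = (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)
    return d2 + 2 / r * d1


R, h = 1.0, 1e-6
inside = radial_laplacian(psi, 0.5)       # should be close to -rho0 = -1
outside = radial_laplacian(psi, 2.0)      # should be close to 0
slope_left = (psi(R) - psi(R - h)) / h    # one-sided derivatives at r = R
slope_right = (psi(R + h) - psi(R)) / h

print(inside, outside, slope_left, slope_right)
```

Both one-sided slopes should agree with $-\rho_0 R/3$, which is the smoothness condition used to fix $C$.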


5.2.2 Some General Results 


So far our solutions to the Poisson equation take place in $\mathbb{R}^3$. (Or, more precisely,
$\mathbb{R}^3 - \{\mathbf{0}\}$ for the $1/r$ solution (5.11) and $\mathbb{R}^3 - \mathbb{R}$ for the $\log r$ solution (5.12).) In
general, we may want to solve the Poisson or Laplace equations $\nabla^2\psi = -\rho$ in some
bounded region $V$. In that case, we must specify boundary conditions on $\partial V$.


There are two common boundary conditions: 


• Dirichlet condition: We fix $\psi(\mathbf{x}) = f(\mathbf{x})$ for some specific $f(\mathbf{x})$ on $\partial V$.

• Neumann condition: We fix $\mathbf{n}\cdot\nabla\psi(\mathbf{x}) = g(\mathbf{x})$ for some specific $g(\mathbf{x})$ on $\partial V$, where
$\mathbf{n}$ is the outwardly pointing normal of $\partial V$.


The Neumann boundary condition is sometimes specified using the slightly peculiar
notation $\partial\psi/\partial n := \mathbf{n}\cdot\nabla\psi$. Or even, sometimes, $\partial_n\psi$. We have the following
statement of uniqueness:


Claim: There is a unique solution to the Poisson equation on a bounded region $V$,
with either Dirichlet or Neumann boundary conditions specified on each boundary $\partial V$.
(In the case of Neumann boundary conditions everywhere, the solution is only unique 
up to a constant.) 


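Proof: Let $\psi_1(\mathbf{x})$ and $\psi_2(\mathbf{x})$ both satisfy the Poisson equation with the specified
boundary conditions. Then $\psi(\mathbf{x}) = \psi_1 - \psi_2$ obeys $\nabla^2\psi = 0$ and either $\psi = 0$ or
$\mathbf{n}\cdot\nabla\psi = 0$ on $\partial V$. Then consider

$$\int_V \nabla\cdot(\psi\nabla\psi)\,dV = \int_V \left(\nabla\psi\cdot\nabla\psi + \psi\nabla^2\psi\right)dV = \int_V |\nabla\psi|^2\,dV$$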


But by the divergence theorem, we have 


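$$\int_V \nabla\cdot(\psi\nabla\psi)\,dV = \int_{\partial V} \psi\nabla\psi\cdot d\mathbf{S} = \int_{\partial V} \psi\,(\mathbf{n}\cdot\nabla\psi)\,dS = 0$$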


where either Dirichlet or Neumann boundary conditions set the boundary term to zero.
Because $|\nabla\psi|^2 \geq 0$, the integral can only vanish if $\nabla\psi = 0$ everywhere in $V$, so $\psi$ must
be constant. If Dirichlet boundary conditions are imposed anywhere, then that constant
must be zero.


This result means that if we can find any solution (say an isotropic solution, or
perhaps a separable solution of the form $\psi(\mathbf{x}) = \Phi(r)Y(\theta)$) then this must be the
unique solution. By considering the limit of large spheres, it is also possible to extend
the proof to solutions on $\mathbb{R}^3$, with the boundary condition $\psi(\mathbf{x}) \to 0$ suitably quickly
as $r \to \infty$.


Note, however, that this doesn't necessarily tell us that a solution exists. For
example, suppose that we wish to solve the Poisson equation $\nabla^2\psi = \rho(\mathbf{x})$ with a fixed
Neumann boundary condition $\mathbf{n}\cdot\nabla\psi = g(\mathbf{x})$ on $\partial V$. Then there can only be a solution
provided that there is a particular relationship between $\rho$ and $g$,

$$\int_V \nabla^2\psi\,dV = \int_{\partial V} \nabla\psi\cdot d\mathbf{S} \quad\Rightarrow\quad \int_V \rho\,dV = \int_{\partial V} g\,dS$$


In other situations, there may well be other requirements.

If the region $V$ has several boundaries, it's quite possible to specify a different type
of boundary condition on each, and the uniqueness statement still holds. This kind of
problem arises in electromagnetism where you solve for the electric field in the presence
of a bunch of “conductors” (for now, conductors just means a chunk of metal). The
electric field vanishes inside a conductor since, if it didn't, the electric charges inside
would move around until they created a counterbalancing field. So any attempt to solve
for the electric field outside the conductors must take this into account by imposing
certain boundary conditions on the surface of the conductor. It turns out that both
Dirichlet and Neumann boundary conditions are important here. If the conductor
is “grounded”, meaning that it is attached to some huge reservoir of charge like the
Earth, then it sits at some fixed potential, typically $\psi = 0$. This is a Dirichlet
boundary condition. In contrast, if the conductor is isolated and carries some
non-vanishing charge then it will act as a source of electric field, but this field is always
emitted perpendicular to the boundary. This, then, specifies $\mathbf{n}\cdot\mathbf{E} = -\mathbf{n}\cdot\nabla\psi$, giving
Neumann boundary conditions. You can learn more about this in the lectures on
Electromagnetism.


Green's Identities 


The proof of the uniqueness theorem used a trick known as Green's (first) identity, 
namely 


$$\int_V \phi\,\nabla^2\psi\,dV = -\int_V \nabla\phi\cdot\nabla\psi\,dV + \int_S \phi\,\nabla\psi\cdot d\mathbf{S}$$


This is essentially a 3d version of integration by parts and it follows simply by applying
the divergence theorem to $\phi\nabla\psi$. We used it in the above proof with $\phi = \psi$, but the
more general form given above is sometimes useful, as is a related formula that follows
simply by anti-symmetrisation,

$$\int_V \left(\phi\,\nabla^2\psi - \psi\,\nabla^2\phi\right)dV = \int_S \left(\phi\nabla\psi - \psi\nabla\phi\right)\cdot d\mathbf{S}$$


This is known as Green's second identity. 


Harmonic Functions 


Solutions to the Laplace equation

$$\nabla^2\psi = 0$$

arise in many places in mathematics and physics. These solutions are so special that
they get their own name: they are called harmonic functions. Here are two properties
of these functions.


Claim: Suppose that $\psi$ is harmonic in a region $V$ that includes the solid sphere
with boundary $S_R$: $|\mathbf{x} - \mathbf{a}| = R$. Then the value of $\psi$ at $\mathbf{a}$, the centre of the sphere, is
given by $\psi(\mathbf{a}) = \bar{\psi}(R)$ where

$$\bar{\psi}(R) = \frac{1}{4\pi R^2}\int_{S_R} \psi(\mathbf{x})\,dS$$

is the average of $\psi$ over $S_R$. This is known as the mean value property.


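Proof: In spherical polar coordinates centred on $\mathbf{a}$, the area element is $dS = r^2\sin\theta\,d\theta\,d\phi$,
so

$$\bar{\psi}(r) = \frac{1}{4\pi}\int_0^{2\pi} d\phi\int_0^{\pi} d\theta\,\sin\theta\,\psi(r,\theta,\phi)$$

and

$$\frac{d\bar{\psi}(R)}{dR} = \frac{1}{4\pi}\int_0^{2\pi} d\phi\int_0^{\pi} d\theta\,\sin\theta\,\left.\frac{\partial\psi}{\partial r}\right|_{r=R} = \frac{1}{4\pi R^2}\int_{S_R}\frac{\partial\psi}{\partial r}\,dS$$

$$\phantom{\frac{d\bar{\psi}(R)}{dR}} = \frac{1}{4\pi R^2}\int_{S_R}\nabla\psi\cdot d\mathbf{S} = \frac{1}{4\pi R^2}\int_{\rm Ball}\nabla^2\psi\,dV = 0$$

But clearly $\bar{\psi}(R) \to \psi(\mathbf{a})$ as $R \to 0$ so we must have $\bar{\psi}(R) = \psi(\mathbf{a})$ for all $R$.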
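The mean value property is easy to test numerically. The following sketch is illustrative only: it averages a function over a sphere with a midpoint rule in $\theta$ and $\phi$, using an arbitrarily chosen harmonic function (a quadratic plus a $1/|\mathbf{x} - \mathbf{p}|$ term whose singular point lies outside the sphere) and, for contrast, the non-harmonic function $|\mathbf{x}|^2$. The function names, the centre and the grid sizes are assumptions made for the example.

```python
import math


def sphere_average(f, centre, R, n_theta=100, n_phi=200):
    """Midpoint-rule average of f over the sphere |x - centre| = R."""
    ax, ay, az = centre
    total = weight = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * math.pi / n_theta
        s, c = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * 2 * math.pi / n_phi
            x = ax + R * s * math.cos(phi)
            y = ay + R * s * math.sin(phi)
            z = az + R * c
            total += s * f(x, y, z)   # sin(theta) is the area weight
            weight += s
    return total / weight


def harmonic(x, y, z):
    # x^2 - y^2 + 3z is harmonic everywhere; 1/|x - p| is harmonic away from p = (3, 0, 0).
    return x * x - y * y + 3 * z + 1 / math.sqrt((x - 3) ** 2 + y * y + z * z)


def not_harmonic(x, y, z):
    return x * x + y * y + z * z      # Laplacian is 6, not 0


a, R = (0.2, -0.1, 0.4), 1.0
avg_h = sphere_average(harmonic, a, R)
avg_n = sphere_average(not_harmonic, a, R)

print(avg_h, harmonic(*a))            # these should agree closely
print(avg_n, not_harmonic(*a))        # these should differ by R^2
```

For the non-harmonic function the average exceeds the central value by $R^2$, which shows that the property genuinely relies on $\nabla^2\psi = 0$.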


Claim: A harmonic function can have neither a maximum nor minimum in the
interior of a region $V$. Any maximum or minimum must lie on the boundary $\partial V$.

Proof: If $\psi$ has a local maximum at $\mathbf{a}$ in $V$ then there exists an $\epsilon$ such that $\psi(\mathbf{x}) < \psi(\mathbf{a})$
for all $0 < |\mathbf{x} - \mathbf{a}| < \epsilon$. But we know that $\bar{\psi}(R) = \psi(\mathbf{a})$ and this contradicts the assumption
for any $0 < R < \epsilon$.


This is consistent with our standard analysis of maxima and minima. Usually we
would compute the eigenvalues $\lambda_i$ of the Hessian $\partial^2\psi/\partial x^i\partial x^j$. For a harmonic function
$\nabla^2\psi = \partial^2\psi/\partial x^i\partial x^i = 0$. Since the trace of the Hessian vanishes, we must have
eigenvalues of opposite sign since $\sum_i \lambda_i = 0$. Hence, any stationary point must be a saddle.
Note that this standard analysis is inconclusive when $\lambda_i = 0$, but the argument using
the mean value property closes this loophole.

5.2.3 Integral Solutions

There is a particularly nice way to write down an expression for the general solution
to the Poisson equation in $\mathbb{R}^3$, with

$$\nabla^2\psi = -\rho(\mathbf{x}) \qquad (\star)$$

at least for a localised source $\rho(\mathbf{x})$ that drops off suitably fast, so $\rho(\mathbf{x}) \to 0$ as $r \to \infty$.


To this end, let's look back to what is, perhaps, our simplest “solution”,

$$\psi(\mathbf{x}) = \frac{\lambda}{4\pi r} \qquad (5.13)$$

for some constant $\lambda$. The question we want to ask is: what equation does this actually
solve?! We've seen in (5.11) that it solves the Laplace equation $\nabla^2\psi = 0$ when $r \neq 0$.
But clearly something’s going on at $r = 0$ because the function diverges there. In the
language of physics, we would say that there is a point particle sitting at $r = 0$, carrying
some mass or charge, giving rise to this potential. What is the correct mathematical
way of capturing this?


To see that there must be something going on at $r = 0$, let’s replay the kind of Gauss
flux games that we met in Section 5.1. We integrate $\nabla^2\psi$, with $\psi$ given by (5.13), over
a spherical region of radius $R$, to find

$$\int_V \nabla^2\psi\,dV = \int_S \nabla\psi\cdot d\mathbf{S} = -\lambda$$

Comparing to $(\star)$, we see that the function (5.13) must solve the Poisson equation
with a source and this source must obey

$$\int_V \rho(\mathbf{x})\,dV = \lambda$$

This makes sense physically, since $\int\rho\,dV$ is the total mass, or total charge, which does
indeed determine the overall scaling $\lambda$ of the potential. But what mathematical function
obeys $\rho(\mathbf{x}) = 0$ for all $\mathbf{x} \neq 0$ yet, when integrated over all space, gives a non-vanishing
constant $\lambda$?

The answer is that $\rho(\mathbf{x})$ must be proportional to the 3d Dirac delta function,

$$\rho(\mathbf{x}) = \lambda\,\delta^3(\mathbf{x})$$

The Dirac delta function should be thought of as an infinitely narrow spike, located at
the origin. It has the properties

$$\delta^3(\mathbf{x}) = 0 \quad\text{for}\quad \mathbf{x} \neq 0$$

and, when integrated against any function $f(\mathbf{x})$ over any volume $V$ that includes the
origin, it gives

$$\int_V f(\mathbf{x})\,\delta^3(\mathbf{x})\,dV = f(\mathbf{x} = 0)$$

The superscript in $\delta^3(\mathbf{x})$ is there to remind us that the delta function should be
integrated over a 3-dimensional volume before it yields something finite. In particular,
when integrated against a constant function, we get a measure of the height of the
spike,

$$\int_V \delta^3(\mathbf{x})\,dV = 1$$


The Dirac delta function is an example of a generalised function, also known as a
distribution. And it is exactly what we need to source the solution $\psi \sim 1/r$. We learn
that the function (5.13) is not a solution to the Laplace equation, but rather a solution
to the Poisson equation with a delta function source

$$\nabla^2\psi = -\lambda\,\delta^3(\mathbf{x}) \quad\Rightarrow\quad \psi(\mathbf{x}) = \frac{\lambda}{4\pi r} \qquad (5.14)$$


With this important idea in hand, we can now do something quite spectacular: we can 
use it to write down an expression for a solution to the general Poisson equation. 


Claim: The Poisson equation $(\star)$ has the integral solution

$$\psi(\mathbf{x}) = \frac{1}{4\pi}\int_{V'} \frac{\rho(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,dV' \qquad (5.15)$$

where the integral is over a region $V'$ parameterised by $\mathbf{x}'$.

Proof: First, some simple intuition behind this formula. A point particle at $\mathbf{x}'$ gives
rise to a potential of the form $\psi(\mathbf{x}) = \rho(\mathbf{x}')/4\pi|\mathbf{x} - \mathbf{x}'|$, which is just our solution (5.14),
translated from the origin to point $\mathbf{x}'$. The integral solution (5.15) then just takes
advantage of the linear nature of the Poisson equation and sums a whole bunch of these
solutions.

The technology of the delta function allows us to make this precise. We can evaluate

$$\nabla^2\psi(\mathbf{x}) = \frac{1}{4\pi}\int_{V'} \rho(\mathbf{x}')\,\nabla^2\left(\frac{1}{|\mathbf{x} - \mathbf{x}'|}\right)dV'$$

where you have to remember that $\nabla^2$ differentiates $\mathbf{x}$ and cares nothing for $\mathbf{x}'$. We then
have the result

$$\nabla^2\frac{1}{|\mathbf{x} - \mathbf{x}'|} = -4\pi\,\delta^3(\mathbf{x} - \mathbf{x}')$$

which is just a repeat of (5.14), but with the location of the source translated from the
origin to the new point $\mathbf{x}'$. Using this, we can continue our proof

$$\nabla^2\psi(\mathbf{x}) = -\int_{V'} \rho(\mathbf{x}')\,\delta^3(\mathbf{x} - \mathbf{x}')\,dV' = -\rho(\mathbf{x})$$


which is what we wanted to show. 
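As a concrete check, the integral solution (5.15) can be evaluated numerically for the uniform ball treated earlier and compared with the patched solution found there. The sketch below is a rough illustration under stated assumptions: $\rho_0 = R = 1$, the field point placed on the $z$-axis so that the $\phi'$ integral is trivially $2\pi$, and a simple midpoint rule in $r'$ and $\theta'$. The function name and grid size are hypothetical.

```python
import math


def psi_integral(r, rho0=1.0, R=1.0, n=200):
    """Midpoint-rule estimate of (5.15) for a uniform ball, at distance r along the z-axis."""
    dr = R / n
    dth = math.pi / n
    total = 0.0
    for i in range(n):
        rp = (i + 0.5) * dr                       # source radius r'
        for j in range(n):
            th = (j + 0.5) * dth                  # source polar angle theta'
            dist = math.sqrt(r * r + rp * rp - 2 * r * rp * math.cos(th))
            total += rho0 * rp * rp * math.sin(th) / dist
    # The phi' integral contributes 2*pi; the prefactor in (5.15) is 1/(4*pi).
    return total * dr * dth * 2 * math.pi / (4 * math.pi)


outside = psi_integral(2.0)    # compare with rho0 R^3 / (3 r) = 1/6
inside = psi_integral(0.5)     # compare with rho0 (3 R^2 - r^2) / 6 = 2.75/6

print(outside, 1 / 6)
print(inside, 2.75 / 6)
```

The agreement is only as good as the quadrature, but it illustrates how summing point-source potentials reproduces the solution obtained from boundary conditions.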


The technique of first solving an equation with a delta function source and
subsequently integrating to find the general solution is known as the Green's function
approach. It is a powerful method to solve differential equations and we will meet it
again in many further courses.

6 Tensors

A famously annoying definition of a tensor is:

A tensor is something whose components transform like a tensor

This becomes even more annoying when you appreciate that this is, in fact, one of
the better definitions of a tensor. The purpose of this section is to explain why this
definition is not as dumb as it sounds and to give some insight into what it means to
be a tensor.


Very roughly speaking, tensors are generalisations of objects like vectors and
matrices. In index notation, a vector has a single index while a matrix has two indices. A
tensor is an object with any number of indices, something like $T_{ij\ldots k}$.

However, this simplistic description hides the most important property of a tensor.
Vectors, matrices and, more generally, tensors are more than just a list of numbers.
Instead, those numbers should be thought of as a useful way of characterising the
underlying object and, because of this, inherit some properties of that underlying object.
As we will see, the key property is how the list of numbers transforms under a change
of basis.


We will start by explaining this in more detail, firstly with vectors and then building 
up to the definition of a tensor. Initially we will keep the discussion restricted to some 
(admittedly rather dry) mathematical formalism. Then, in Section 6.2 we will describe 
some physical examples. 


6.1 What it Takes to Make a Tensor 


Not any list of $n$ numbers constitutes a vector in $\mathbb{R}^n$. Or, said more precisely, not any
list of $n$ numbers constitutes the components of a vector in $\mathbb{R}^n$. For example, if you
write down the heights of the first three people you met this morning, that doesn't make
a vector in $\mathbb{R}^3$. Instead, a vector comes with certain responsibilities. In particular, the
components describe an underlying object which should be independent of the choice
of basis. As we now explain, that means that the components should transform in the
right way under rotations.


We consider a point $\mathbf{x} \in \mathbb{R}^n$. If we wish to attach some coordinates to this point, we
first need to introduce a set of basis vectors $\{\mathbf{e}_i\}$ with $i = 1,\ldots,n$. We will take these
to be orthonormal, meaning that $\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}$. Any vector can then be expressed as

$$\mathbf{x} = x_i\mathbf{e}_i$$

Usually we conflate the components $x_i = (x_1,\ldots,x_n)$ with the “vector”. But, for our
purposes, we should remember that these are just a useful way of representing the more
abstract object $\mathbf{x}$. In particular, we're entirely at liberty to take a different set of basis
vectors,

$$\mathbf{e}'_i = R_{ij}\mathbf{e}_j \qquad (6.1)$$

If we ask that $\mathbf{e}'_i$ are also orthonormal, so $\mathbf{e}'_i\cdot\mathbf{e}'_j = \delta_{ij}$, then we have

$$\mathbf{e}'_i\cdot\mathbf{e}'_j = R_{ik}R_{jl}\,\mathbf{e}_k\cdot\mathbf{e}_l = R_{ik}R_{jk} = \delta_{ij}$$

or, in matrix notation,

$$R R^T = 1$$


Matrices of this kind are said to be orthogonal. We write $R \in O(n)$. Taking the
determinant, we have $\det R = \pm 1$. Those matrices with $\det R = +1$ correspond to
rotations and are said to be special orthogonal. We write $R \in SO(n)$. In $\mathbb{R}^3$, a
rotation $R \in SO(3)$ takes a right-handed orthonormal basis into another right-handed
orthonormal basis. Those matrices with $\det R = -1$ correspond to a rotation together
with a reflection and take a right-handed basis to a left-handed basis.


Under a change of basis, the vector x itself doesn't change. But its components do. 
We have 


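$$\mathbf{x} = x_i\mathbf{e}_i = x'_i\mathbf{e}'_i = x'_i R_{ij}\mathbf{e}_j$$

So the components transform under the same rotation matrix $R$,

$$x_j = R_{ij}x'_i \quad\Rightarrow\quad x'_i = R_{ij}x_j \qquad (6.2)$$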


A tensor T is a generalisation of these ideas to an object with more indices. Just as 
the vector x has an identity independent of any choice of basis, so too does the tensor 
$T$. But when measured with respect to a chosen basis $\{\mathbf{e}_i\}$, a tensor of rank $p$ has
components $T_{i_1\ldots i_p}$. When we change the basis using (6.1), the tensor transforms as

$$T'_{i_1\ldots i_p} = R_{i_1 j_1}\ldots R_{i_p j_p}\,T_{j_1\ldots j_p} \qquad (6.3)$$


This is known as the tensor transformation rule. A tensor of rank $p$ is sometimes
referred to simply as a $p$-tensor.

The simplest examples of tensors are very familiar. A tensor of rank 0 is just a
number, or scalar, $T$. Here there's no requirement because a number doesn't change if
you do a rotation: $T' = T$. So any single number can be said to be a tensor, although
it isn't a particularly helpful designation.

A tensor of rank 1 is a vector. Here, however, it’s important that the components of
the vector transform as $T'_i = R_{ij}T_j$. If they don't transform in this way, then you don't
have a tensor on your hands. You just have a bunch of numbers.

A tensor of rank 2 is a matrix that transforms as $T'_{ij} = R_{ik}R_{jl}T_{kl}$. Again, the
transformation property is key. Just because you have an array of numbers $A_{ij}$, arranged
in an $n \times n$ grid, doesn't mean that you have a 2-tensor. You have to check the
transformation property holds. Otherwise, as with a vector, the array of numbers isn't a
tensor; it's just a bunch of numbers.


What's a Tensor and What's Not? 


It's worth elaborating on the definition of a tensor. For example, suppose that someone 
hands you a matrix, say 


$$T_{ij} = \begin{pmatrix} 3 & 8 & 0 \\ 5 & -4 & 3 \\ 1 & 1 & 3 \end{pmatrix}$$


and asks you: “is this a tensor?”. It's natural to answer yes. After all, it's written as
$T_{ij}$ which is the name we've given to a tensor. And it looks for all the world like a
matrix. So is it a tensor? The answer is: we don’t know. We haven't been given enough 
information. As we've stressed several times, a tensor isn't just a bunch of numbers 
arranged in some pattern. This sometimes goes by the name of an array of numbers. 
Instead, we only know that a given array of numbers is a tensor if it transforms as 
(6.3). That means that we need to firstly know what basis the array of numbers above 
has been measured in. And then we need to know what the array looks like when 
measured in other bases. Only then do we have enough information to say whether 
this is a tensor or not. It's a tensor only if it transforms as (6.3): this transformation law
is the definition of a tensor.

Here's another example. In a given basis, the position of a point is given by $x_i$. We
write this as the components of a vector

$$x_i = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

This is a tensor. Indeed, our starting point is that the components of this simple vector
transform in the tensorial way (6.2). This is just the statement that the components
of this vector transform in the familiar way under rotation.

Suppose that you now square each of these elements and decide to write them as a
column vector. We'll give it a fancy name $A_i$, complete with that hanging $i$ index,

$$A_i = \begin{pmatrix} x^2 \\ y^2 \\ z^2 \end{pmatrix}$$

That $i$ index makes this look for all the world like it’s a tensor. But it’s not. We know
that after a rotation, $x_i \to x'_i = R_{ij}x_j$. This means that if we do a rotation and then
measure the components of the array $A_i$ we get

$$A'_i = \begin{pmatrix} (R_{11}x + R_{12}y + R_{13}z)^2 \\ (R_{21}x + R_{22}y + R_{23}z)^2 \\ (R_{31}x + R_{32}y + R_{33}z)^2 \end{pmatrix}$$


But that’s most definitely not how a tensor transforms! It’s not the rule (6.3) that we
wanted. The upshot is that $A_i$ is not a tensor and it was a little bit naughty to write
it as $A_i$ because it suggests that it has some property that it doesn’t.
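To see the failure concretely, one can compare the array of squared components measured in a rotated basis with what the vector rule (6.2) would predict. The sketch below is purely illustrative: the rotation angle, the sample point and the helper names are arbitrary assumptions.

```python
import math


def rotation_z(angle):
    """Rotation matrix about the z-axis, in the convention of x'_i = R_ij x_j."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def apply(R, v):
    """Components in the rotated basis: v'_i = R_ij v_j."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


R = rotation_z(math.pi / 6)
x = [1.0, 2.0, 3.0]
x_new = apply(R, x)

A_measured = [c * c for c in x_new]            # square the components in the new basis
A_if_vector = apply(R, [c * c for c in x])     # what the vector rule (6.2) would give

length_old = math.sqrt(sum(c * c for c in x))
length_new = math.sqrt(sum(c * c for c in x_new))

print(A_measured)
print(A_if_vector)
```

The two printed arrays disagree, while the length of the genuine vector is unchanged by the rotation.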


Why are we making such a big deal about this? What is so special about things 
that transform nicely as (6.3) under rotations? Well, there are several answers to this, 
depending on taste. At the most basic level, if you're a physicist, then you might 
genuinely want to know how something looks in different, rotated frames of reference. 


Moreover, once you realise that there's a preferred way for things to transform — 
the tensor way (6.3) — this brings some extra power to the calculations, a little like 
dimensional analysis. Suppose that you have an equation of the form “left-hand side”
= “right-hand side”. If the thing on the left is a tensor then the thing on the right
better also be a tensor. And sometimes there aren't many tensors available, which limits
your options for what the thing on the right can actually be. We'll see an example of
this in Section 6.1.3 when we'll use tensors to make some scary looking integrals a little
more palatable.


The discussion above is very much from a physics perspective. But what about a 
pure maths perspective? This gives a more formal, but arguably cleaner, definition of 
a tensor. We'll explain this imminently in Section 6.1.1. 


We'll meet a number of tensors as we proceed. But there is one that is special:
this is the rank 2 tensor $\delta_{ij}$ or, equivalently, the unit matrix. Importantly, it has the
same 0 and 1 entries in any basis because, under the transformation (6.3), it becomes

$$\delta'_{ij} = R_{ik}R_{jl}\,\delta_{kl} = \delta_{ij}$$

We will devote Section 6.1.3 to “invariant tensors” which, like $\delta_{ij}$, take the same form
in any basis.


6.1.1 Tensors as Maps 


There is something a little strange about the definition of a tensor given above. We 
first pick a set of coordinates, and the transformation law (6.3) then requires that the 
tensor transforms nicely so that, ultimately, nothing depends on these coordinates. 
But, if that's the case, surely there should be a definition of a tensor that doesn't rely 
on coordinates at all! 


There is. A tensor $T$ of rank $p$ is a multi-linear map that takes $p$ vectors, $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{c}$
and spits out a number in $\mathbb{R}$,

$$T(\mathbf{a}, \mathbf{b}, \ldots, \mathbf{c}) = T_{i_1 i_2\ldots i_p}\,a_{i_1} b_{i_2}\ldots c_{i_p} \qquad (6.4)$$

Here “multi-linear” means that $T$ is linear in each of the entries $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{c}$ individually.
By evaluating $T$ on all possible vectors $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{c}$, we get the components $T_{i_1 i_2\ldots i_p}$.

The transformation rule (6.3) is simply the statement that the map $T$ is independent
of the choice of basis, and we can equally well write

$$T(\mathbf{a}, \mathbf{b}, \ldots, \mathbf{c}) = T'_{i_1\ldots i_p}\,a'_{i_1} b'_{i_2}\ldots c'_{i_p}$$

$$= \left(R_{i_1 j_1}\ldots R_{i_p j_p}\,T_{j_1\ldots j_p}\right)\left(R_{i_1 k_1}a_{k_1}\right)\left(R_{i_2 k_2}b_{k_2}\right)\ldots\left(R_{i_p k_p}c_{k_p}\right)$$

$$= T_{j_1\ldots j_p}\,a_{j_1} b_{j_2}\ldots c_{j_p}$$

which follows because $R^T R = 1$ or, in components, $R_{ij}R_{ik} = \delta_{jk}$. The key is that this
formula takes the same form in any basis.

Tensors as Maps Between Vectors

Rather than thinking of a tensor as a map from many vectors to $\mathbb{R}$, you can equivalently
think of it as a map from some lower-rank tensor to another. For example, in (6.4),
if you don't fill in the first entry, then a rank $p$ tensor can equally well be viewed as
taking $(p - 1)$ vectors and spitting out a single vector

$$a_i = T_{i j_1\ldots j_{p-1}}\,b_{j_1}\ldots c_{j_{p-1}}$$


This is the way that tensors typically arise in physics or applied mathematics, where 
the most common example is simply a rank 2 tensor, defined as a map from one vector 
to another 


$$\mathbf{u} = T\mathbf{v} \quad\Leftrightarrow\quad u_i = T_{ij}v_j$$

Until now, we've simply called $T$ a matrix but for the equation $\mathbf{u} = T\mathbf{v}$ to make sense,
$T$ must transform as a tensor (6.3). This is inherited from the transformation rules of
the vectors, $u'_i = R_{ij}u_j$ and $v'_i = R_{ij}v_j$, giving

$$u'_i = T'_{ij}v'_j \quad\text{with}\quad T'_{ij} = R_{ik}T_{kl}R_{jl}$$

Written as a matrix equation, this is $T' = R T R^T$.
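The statement that a linear map between vectors must transform as a rank 2 tensor can be checked with a few lines of code. In this sketch, which is only an illustration, the matrix, the vector and the rotation are arbitrary choices and the helper functions are hypothetical; the point is that transforming the map and its input gives the same answer as transforming the output.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def matvec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]


R = matmul(rot_z(0.7), rot_x(-1.1))                        # an arbitrary rotation in SO(3)
T = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 2.0]]   # arbitrary illustrative components
v = [1.0, -2.0, 0.5]

u = matvec(T, v)                               # u_i = T_ij v_j in the original basis
T_new = matmul(matmul(R, T), transpose(R))     # T' = R T R^T
u_new = matvec(T_new, matvec(R, v))            # u'_i = T'_ij v'_j in the rotated basis
u_rotated = matvec(R, u)                       # u'_i = R_ij u_j

print(u_new)
print(u_rotated)
```

The two printed vectors should agree to rounding error, which is the content of $T' = R T R^T$.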


6.1.2 Tensor Operations 


Given a bunch of tensors, there are some manipulations that leave you with another 
tensor. Here we describe these operations. 


• We can add and subtract tensors of the same rank, so if $S$ and $T$ are both tensors
of rank $p$ then so too is $S + T$. We can also multiply a tensor by a constant $\alpha$
and it remains a tensor.

• If $S$ is a tensor of rank $p$ and $T$ a tensor of rank $q$, then the tensor product $S \otimes T$
is a tensor of rank $p + q$, defined by

$$(S \otimes T)_{i_1\ldots i_p j_1\ldots j_q} = S_{i_1\ldots i_p}\,T_{j_1\ldots j_q}$$

You can check that the components of $(S \otimes T)$ do indeed satisfy the transformation
rule (6.3). In particular, if we have $p$ different vectors $\mathbf{a}, \mathbf{b}, \ldots, \mathbf{c}$ then we can
construct a tensor

$$T = \mathbf{a} \otimes \mathbf{b} \otimes \ldots \otimes \mathbf{c} \quad\text{with}\quad T_{i_1\ldots i_p} = a_{i_1} b_{i_2}\ldots c_{i_p}$$

• Given a tensor $T$ of rank $p$, we can construct a new tensor $S$ of rank $(p - 2)$ by
contracting on two indices using $\delta_{ij}$,

$$S_{k_1\ldots k_{p-2}} = \delta_{ij}\,T_{ij k_1\ldots k_{p-2}}$$

For a rank 2 tensor, the contraction is what we call the trace, ${\rm Tr}\,T = T_{ii}$. It's a
valid tensor operation because the end result is a scalar that does not transform
under rotations

$$T'_{ii} = R_{ij}R_{ik}T_{jk} = \delta_{jk}T_{jk} = T_{ii}$$

The same derivation shows that higher rank tensors can also be contracted, with
the additional indices unaffected by the contraction.

Combining a contraction with a tensor product gives a way to contract two
different tensors together. For example, given a $p$-tensor $P$ and $q$-tensor $Q$, we
can form a $p + q - 2$ tensor by contracting, say, the first index on each to get
$P_{i k_1\ldots k_{p-1}} Q_{i l_1\ldots l_{q-1}}$. This may sound abstract, but it's very much something you've
seen before: given a pair of 1-tensors $\mathbf{a}$ and $\mathbf{b}$, also known as vectors, we can
combine them to get a 0-tensor, also known as a number

$$\mathbf{a}\cdot\mathbf{b} = a_i b_i$$

This, of course, is just the inner-product. It is a useful operation precisely because
the 0-tensor on the right-hand side is, like all 0-tensors, independent of the choice
of basis that we choose to express the vectors.


The Quotient Rule 


In practice, it's not hard to recognise a tensor when you see one. In any setting, they're
usually just objects with a bunch of $i$ and $j$ indices, each of which clearly transforms as
a vector. If in doubt, you can just check explicitly how the thing transforms. (There
are cases where this check is needed. In later courses, you'll meet an object called the
Levi-Civita connection $\Gamma^i_{jk}$, which looks for all the world like a tensor but turns out, on
closer inspection, to be something more subtle.)


There is a more formal way to say this. Let $T_{i_1\ldots i_{p+q}}$ be a bunch of numbers that you
think might comprise a tensor of rank $p + q$ in some coordinate basis. If $T_{i_1\ldots i_{p+q}}$ are
indeed the components of a tensor then you can feed it a rank $q$ tensor $u_{j_1\ldots j_q}$ and it
will spit back a rank $p$ tensor

$$v_{i_1\ldots i_p} = T_{i_1\ldots i_p j_1\ldots j_q}\,u_{j_1\ldots j_q} \qquad (6.5)$$

There is a converse to this statement. If for every tensor $u_{j_1\ldots j_q}$, the output $v_{i_1\ldots i_p}$
defined in (6.5) is a tensor, then $T_{i_1\ldots i_p j_1\ldots j_q}$ are the components of a tensor. This is
called the quotient rule.

It is straightforward, if a little fiddly, to prove the quotient rule. It's sufficient to
restrict attention to tensors $u$ formed from the tensor product of vectors $u_{j_1\ldots j_q} =
c_{j_1}\ldots d_{j_q}$. Then, by assumption, $v_{i_1\ldots i_p} = T_{i_1\ldots i_p j_1\ldots j_q}u_{j_1\ldots j_q}$ is a tensor. If we then
contract with $p$ further vectors $\mathbf{a}, \ldots, \mathbf{b}$ then $v_{i_1\ldots i_p}a_{i_1}\ldots b_{i_p} = T_{i_1\ldots i_p j_1\ldots j_q}\,a_{i_1}\ldots b_{i_p}\,c_{j_1}\ldots d_{j_q}$
is necessarily a scalar. This is then enough to ensure the correct transformation rule
(6.3) for the components $T_{i_1\ldots i_p j_1\ldots j_q}$.


Symmetry and Anti-Symmetry 


The symmetrisation properties of tensors are worthy of comment. A tensor that obeys

$$T_{ijp\ldots q} = \pm T_{jip\ldots q}$$

is said to be symmetric (for $+$) or anti-symmetric (for $-$) in the indices $i$ and $j$. If a
tensor is (anti)-symmetric in one coordinate system then it is (anti)-symmetric in any
coordinate system

$$T'_{ijp\ldots q} = R_{ik}R_{jl}R_{pr}\ldots R_{qs}\,T_{klr\ldots s} = \pm R_{ik}R_{jl}R_{pr}\ldots R_{qs}\,T_{lkr\ldots s} = \pm T'_{jip\ldots q}$$


A tensor that is (anti)-symmetric in all pairs of indices is said to be totally
(anti)-symmetric. Note that for tensors in $\mathbb{R}^n$, there are no non-zero totally anti-symmetric
tensors of rank $p > n$ because at least two of the indices must take the same value and so
the tensor necessarily vanishes. A totally anti-symmetric tensor of rank $p$ in $\mathbb{R}^n$ has
$\binom{n}{p}$ independent components.


Let’s now restrict our attention to $\mathbb{R}^3$. A tensor of rank 2 is our new fancy name
for a $3 \times 3$ matrix $T_{ij}$. In general, it has 9 independent components. We can always
decompose it into the symmetric and anti-symmetric pieces

$$S_{ij} = \frac{1}{2}\left(T_{ij} + T_{ji}\right) \quad\text{and}\quad A_{ij} = \frac{1}{2}\left(T_{ij} - T_{ji}\right)$$

which have 6 and 3 independent components respectively. Our discussion above shows
that $S$ and $A$ are each, themselves, tensors. In fact, the symmetric piece can be
decomposed further,

$$S_{ij} = P_{ij} + \frac{Q}{3}\,\delta_{ij}$$

where $Q = S_{ii}$ is the trace of $S$ and carries a single degree of freedom, while $P_{ij}$ is the
traceless part of $S$ and carries 5. The importance of this decomposition is that $A$, $P$
and $Q$ are individually tensors. In contrast, if you were to take, say, the upper-left-hand
component of the original matrix $T_{ij}$ then that doesn't form a tensor.


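In $\mathbb{R}^3$, we can also rewrite an anti-symmetric matrix in terms of a vector,

$$A_{ij} = \epsilon_{ijk}B_k \quad\Leftrightarrow\quad B_k = \frac{1}{2}\epsilon_{ijk}A_{ij}$$

The upshot is that any $3 \times 3$ matrix can be decomposed as

$$T_{ij} = P_{ij} + \epsilon_{ijk}B_k + \frac{1}{3}\delta_{ij}Q \qquad (6.6)$$

where $P_{ii} = 0$.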
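The decomposition (6.6) is straightforward to carry out explicitly. The sketch below, offered only as an illustration, splits an arbitrary $3 \times 3$ array into its traceless symmetric part $P$, the vector $B$ and the trace $Q$, then reassembles it. Indices run over 0, 1, 2 rather than 1, 2, 3, and the function names and sample matrix are hypothetical.

```python
def levi_civita(i, j, k):
    """epsilon_ijk for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def decompose(T):
    """Split a 3x3 array T into (P, B, Q) as in (6.6)."""
    idx = range(3)
    Q = sum(T[i][i] for i in idx)                                   # trace
    A = [[0.5 * (T[i][j] - T[j][i]) for j in idx] for i in idx]     # anti-symmetric part
    P = [[0.5 * (T[i][j] + T[j][i]) - (Q / 3 if i == j else 0.0)    # traceless symmetric part
          for j in idx] for i in idx]
    B = [0.5 * sum(levi_civita(i, j, k) * A[i][j] for i in idx for j in idx) for k in idx]
    return P, B, Q


def reconstruct(P, B, Q):
    """Reassemble T_ij = P_ij + epsilon_ijk B_k + delta_ij Q / 3."""
    idx = range(3)
    return [[P[i][j] + sum(levi_civita(i, j, k) * B[k] for k in idx)
             + (Q / 3 if i == j else 0.0) for j in idx] for i in idx]


T = [[1.0, 2.0, 0.0], [0.0, 3.0, -1.0], [4.0, 0.0, 2.0]]   # arbitrary illustrative components
P, B, Q = decompose(T)
T_back = reconstruct(P, B, Q)

print(P, B, Q)
```

The counting works out as in the text: 5 independent entries in $P$, 3 in $B$ and 1 in $Q$.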


6.1.3 Invariant Tensors 


There are two important invariant tensors in $\mathbb{R}^n$.

• We've met the first already: it is the rank 2 tensor $\delta_{ij}$. As we noted previously,
this is invariant because

$$\delta'_{ij} = R_{ik}R_{jl}\delta_{kl} = \delta_{ij}$$

Note that $\delta_{ij}$ is invariant under any $R \in O(n)$.

• The rank $n$ totally anti-symmetric tensor $\epsilon_{i_1\ldots i_n}$. This is defined by $\epsilon_{12\ldots n} = +1$.
If you swap any two indices you get a minus sign. In particular, if any two indices
are repeated, the epsilon symbol vanishes. This is invariant because

$$\epsilon'_{i_1\ldots i_n} = R_{i_1j_1}\ldots R_{i_nj_n}\,\epsilon_{j_1\ldots j_n} = \det R\;\epsilon_{i_1\ldots i_n} = \epsilon_{i_1\ldots i_n}$$

Note that the epsilon symbol is only invariant under $R \in SO(n)$ but it is not
invariant under $R \in O(n)$ with $\det R = -1$. It picks up a minus sign under
reflections. The invariance of $\epsilon_{ijk}$ in $\mathbb{R}^3$ is the reason why the cross-product
$(\mathbf{a} \times \mathbf{b})_i = \epsilon_{ijk}a_jb_k$ is itself a vector. Or, said differently, why the triple product
$\mathbf{a}\cdot(\mathbf{b} \times \mathbf{c}) = \epsilon_{ijk}a_ib_jc_k$ is independent of the choice of basis.
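
The behaviour of the epsilon symbol under rotations and reflections can be confirmed by brute force. The sketch below is an illustration only: the function names, the rotation angle and the choice of reflection are assumptions made here. It transforms $\epsilon_{ijk}$ with a rotation about the $z$-axis and with a reflection in the $z = 0$ plane.

```python
import math
from itertools import product

def levi_civita(i, j, k):
    """Epsilon symbol for indices in {0, 1, 2} (zero-based)."""
    return (i - j) * (j - k) * (k - i) // 2

def transform_rank3(R, T):
    """Return T'_ijk = R_il R_jp R_kq T_lpq for a rank 3 tensor stored as a dict."""
    idx = range(3)
    return {(i, j, k): sum(R[i][l] * R[j][p] * R[k][q] * T[(l, p, q)]
                           for l, p, q in product(idx, repeat=3))
            for i, j, k in product(idx, repeat=3)}

eps = {ijk: levi_civita(*ijk) for ijk in product(range(3), repeat=3)}

t = 0.7  # arbitrary rotation angle
rotation = [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]                          # det R = +1
reflection = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]       # det R = -1

eps_rot = transform_rank3(rotation, eps)     # should equal eps
eps_ref = transform_rank3(reflection, eps)   # should equal -eps
```

The rotated components agree with the original ones up to rounding, while the reflected components all flip sign, in line with the factor of $\det R$.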


In general, a tensor of rank $p$ is said to be invariant under a given rotation $R$ if

$$T'_{i_1\ldots i_p} = R_{i_1j_1}\ldots R_{i_pj_p}\,T_{j_1\ldots j_p} = T_{i_1\ldots i_p}$$

A tensor that is invariant under all rotations $R$ is said to be isotropic. Obviously all
tensors of rank 0 are isotropic. What about higher rank tensors?

Claim: The only non-zero isotropic tensors in $\mathbb{R}^3$ of rank $p = 1, 2$ or $3$ are $T_{ij} = \alpha\delta_{ij}$
and $T_{ijk} = \beta\epsilon_{ijk}$, with $\alpha$ and $\beta$ constant. In particular, there are no isotropic tensors of
rank 1 (essentially because a vector always points in a preferred direction).


Proof: The idea is simply to look at how tensors transform under a bunch of specific
rotations by $\pi$ or $\pi/2$ about certain axes.

For example, consider a tensor of rank 1, so that

$$T'_i = R_{ij}T_j \quad\text{with}\quad R_{ij} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & +1 \end{pmatrix} \tag{6.7}$$

Requiring $T'_i = T_i$ gives $T_1 = T_2 = 0$. Clearly a similar argument, using a different $R$,
also gives $T_3 = 0$.


For a tensor of rank 2, consider the transformation

$$T'_{ij} = R_{ik}R_{jl}T_{kl} \quad\text{with}\quad R_{ij} = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & +1 \end{pmatrix} \tag{6.8}$$

which is a rotation by $\pi/2$ about the $z$-axis. The rotation gives $T'_{13} = T_{23}$ and $T'_{23} = -T_{13}$
so if $T'_{ij} = T_{ij}$ we must have $T_{13} = T_{23} = 0$. Meanwhile $T'_{11} = T_{22}$. Similar
arguments tell us that all off-diagonal elements must vanish and all diagonal elements
must be equal: $T_{11} = T_{22} = T_{33} = \alpha$ for some $\alpha$. Hence $T_{ij} = \alpha\delta_{ij}$.


Finally, for a rank 3 tensor we have

$$T'_{ijk} = R_{il}R_{jp}R_{kq}T_{lpq}$$

If we pick $R$ given in (6.7), then we find $T'_{133} = -T_{133}$ and $T'_{111} = -T_{111}$. Similar
arguments show that an isotropic tensor must have $T_{ijk} = 0$ unless $i$, $j$ and $k$ are all
distinct. Meanwhile, if we pick $R$ given in (6.8), then we get $T'_{123} = -T_{213}$. We
end up with the result we wanted: $T_{ijk}$ is isotropic if and only if $T_{ijk} = \beta\epsilon_{ijk}$ for some
constant $\beta$.

Although we won't prove it here, all other isotropic tensors can be formed from $\delta_{ij}$
and $\epsilon_{ijk}$. For example, the most general isotropic 4-tensor in $\mathbb{R}^3$ is

$$T_{ijkl} = \alpha\,\delta_{ij}\delta_{kl} + \beta\,\delta_{ik}\delta_{jl} + \gamma\,\delta_{il}\delta_{jk}$$

with $\alpha$, $\beta$ and $\gamma$ constants. You could try to cook up something involving $\epsilon_{ijk}$ but it
doesn’t give anything new. In particular, $\epsilon_{ijk}\epsilon_{ilp} = \delta_{jl}\delta_{kp} - \delta_{jp}\delta_{kl}$.
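
The identity relating two epsilon symbols to Kronecker deltas involves only $3^4 = 81$ index combinations, so it can be checked exhaustively. A small sketch, with helper names chosen here for illustration:

```python
from itertools import product

def levi_civita(i, j, k):
    """Epsilon symbol for indices in {0, 1, 2} (zero-based)."""
    return (i - j) * (j - k) * (k - i) // 2

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

# Check eps_ijk eps_ilp = delta_jl delta_kp - delta_jp delta_kl for every j, k, l, p.
identity_holds = all(
    sum(levi_civita(i, j, k) * levi_civita(i, l, p) for i in range(3))
    == delta(j, l) * delta(k, p) - delta(j, p) * delta(k, l)
    for j, k, l, p in product(range(3), repeat=4)
)
```

Note that the right-hand side is a combination of the three isotropic structures above with $\alpha = 0$, $\beta = 1$ and $\gamma = -1$ (after relabelling indices), which is why the product of epsilons gives nothing new.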


There is also an analogous result in $\mathbb{R}^n$: all isotropic tensors can be constructed from
the symmetric 2-tensor $\delta_{ij}$ and the totally anti-symmetric $n$-tensor $\epsilon_{i_1\ldots i_n}$.
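
The strategy of the proof above, imposing invariance under a few specific rotations, amounts to solving a linear system for the components of $T$. The sketch below counts the independent solutions exactly, using rational arithmetic. It uses the rotations (6.7) and (6.8) together with a quarter turn about the $x$-axis; that third rotation, and all function names, are assumptions made for this illustration. A finite set of rotations like this is only expected to be enough for ranks up to 3, as in the claim.

```python
from fractions import Fraction
from itertools import product

R_PI_Z = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]     # rotation (6.7)
R_HALF_Z = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]    # rotation (6.8)
R_HALF_X = [[1, 0, 0], [0, 0, 1], [0, -1, 0]]    # assumed: quarter turn about the x-axis

def constraint_rows(R, p):
    """Rows of the linear system T' - T = 0 for one rotation R acting on a rank p tensor."""
    labels = list(product(range(3), repeat=p))
    rows = []
    for out in labels:
        row = []
        for src in labels:
            coeff = Fraction(1)
            for a, b in zip(out, src):
                coeff *= R[a][b]
            row.append(coeff - (1 if out == src else 0))
        rows.append(row)
    return rows

def matrix_rank(rows):
    """Rank of a matrix by exact Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            if factor != 0:
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def count_invariants(p):
    """Number of independent rank p tensors left invariant by all three rotations."""
    rows = []
    for R in (R_PI_Z, R_HALF_Z, R_HALF_X):
        rows += constraint_rows(R, p)
    return 3 ** p - matrix_rank(rows)

counts = {p: count_invariants(p) for p in (1, 2, 3)}
```

The counts come out as 0, 1 and 1 for ranks 1, 2 and 3, matching the statement that the only candidates are $\alpha\delta_{ij}$ and $\beta\epsilon_{ijk}$.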


Invariant Integrals 


It is sometimes possible to use invariance properties to immediately write down the
index structure of an integral, without doing the hard work of evaluating everything
term by term. Suppose that we have some integral of the form

$$T_{ij\ldots k} = \int_V f(r)\,x_ix_j\ldots x_k\,dV$$

with $r = |\mathbf{x}|$. Then under a rotation, we have

$$T'_{ij\ldots k} = R_{ip}R_{jq}\ldots R_{ks}\,T_{pq\ldots s} = \int_V f(r)\,x'_ix'_j\ldots x'_k\,dV$$

with, as usual, $x'_i = R_{ij}x_j$. But if we now change the integration variables to $\mathbf{x}'$,
both $r = |\mathbf{x}| = |\mathbf{x}'|$ and $dV = dV'$ are invariant. (The latter because the Jacobian is
$\det R = 1$.) If the domain of integration is also rotationally invariant, so $V = V'$, then
the final result must itself be an invariant tensor, $T'_{ij\ldots k} = T_{ij\ldots k}$.


As an example, consider the following 3d integral over the interior of a sphere of
radius $R$

$$T_{ij} = \int_V \rho(r)\,x_ix_j\,dV \tag{6.9}$$

(In Section 6.2, we will find integrals of this form arising when we compute the inertia
tensor of a sphere.) By the argument above $T_{ij}$ must be an isotropic tensor and hence
proportional to $\delta_{ij}$,

$$T_{ij} = \int_V \rho(r)\,x_ix_j\,dV = \alpha\,\delta_{ij}$$

for some $\alpha$. If we take the trace, we get

$$\int_V \rho(r)\,r^2\,dV = 3\alpha$$

Hence,

$$T_{ij} = \frac{1}{3}\delta_{ij}\int_V \rho(r)\,r^2\,dV = \frac{4\pi}{3}\delta_{ij}\int_0^R dr\,\rho(r)\,r^4 \tag{6.10}$$

For example, if $\rho(r) = \rho_0$ is constant, then $T_{ij} = \frac{4}{15}\pi\rho_0R^5\,\delta_{ij}$.
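
As a sanity check on the isotropy argument, the integral (6.9) can also be evaluated the hard way. The sketch below is only an illustration: the function name, the grid size and the sample values of the density and radius are choices made here. It applies a crude midpoint rule in spherical polar coordinates and compares the result with the constant-density formula.

```python
import math

def sphere_moment(rho, R, n=30):
    """Midpoint-rule estimate of T_ij = integral of rho(r) x_i x_j over a ball of radius R."""
    T = [[0.0] * 3 for _ in range(3)]
    dr, dth, dph = R / n, math.pi / n, math.pi / n   # 2n steps are used in phi
    for a in range(n):
        r = (a + 0.5) * dr
        for b in range(n):
            th = (b + 0.5) * dth
            for c in range(2 * n):
                ph = (c + 0.5) * dph
                x = (r * math.sin(th) * math.cos(ph),
                     r * math.sin(th) * math.sin(ph),
                     r * math.cos(th))
                # Volume element dV = r^2 sin(theta) dr dtheta dphi
                w = rho(r) * r * r * math.sin(th) * dr * dth * dph
                for i in range(3):
                    for j in range(3):
                        T[i][j] += w * x[i] * x[j]
    return T

rho0, R = 2.0, 1.5                        # sample density and radius
T = sphere_moment(lambda r: rho0, R)
expected = 4 * math.pi * rho0 * R**5 / 15  # coefficient of delta_ij for constant density
```

The off-diagonal entries vanish to rounding error and the diagonal entries agree with `expected` to within the accuracy of the quadrature, which is what the one-line symmetry argument predicts.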


6.1.4 Tensor Fields 


A tensor field over $\mathbb{R}^3$ is the assignment of a tensor $T_{i\ldots k}(\mathbf{x})$ to every point $\mathbf{x} \in \mathbb{R}^3$.
This is the generalisation of a vector field

$$\mathbf{F} : \mathbb{R}^3 \to \mathbb{R}^3$$

to a map of the kind

$$T : \mathbb{R}^3 \to \mathbb{R}^m$$

with $m$ the number of components of the tensor. So, for example, a map that assigns
a symmetric, traceless rank 2 tensor $P_{ij}(\mathbf{x})$ to every point has $m = 5$.

The tensor field $T_{i\ldots k}(\mathbf{x})$ is sometimes denoted as $T_{i\ldots k}(x^l)$ which is supposed to show
that the field depends on all coordinates $x^1, \ldots, x^3$. It's not great notation because the
indices as subscripts are supposed to take some definite values, while the index $l$ in the
argument is supposed to denote the whole set of indices. It's especially bad notation
when combined with the summation convention and we won't adopt it here.


There is one very famous example of a tensor field. Einstein's theory of general relativity
is described by a rank 2 tensor at every point in space. This is called the metric.
The dynamics of this rank 2 tensor field describe gravity. (I’ve brushed something 
rather important under the rug here. Einstein's theory is a rank 2 tensor in spacetime, 
not just in space. Which means that the rank 2 tensor is a 4 x 4 matrix, rather than a 
3 x 3 matrix.) 


Before we move on, it's worth pausing to mention a slightly subtle point. Not all
maps $\mathbb{R}^3 \to \mathbb{R}^3$ qualify as “vector fields". The point $\mathbf{x}$ in the domain $\mathbb{R}^3$ is a vector
and so its components transform in the appropriate way under rotation. To be a vector
field, the components of the map must transform under the same rotation. Similar
comments hold for a tensor field.


To illustrate this, the electric field $\mathbf{E}(\mathbf{x})$ is an example of a vector field. If you rotate
in space, and so change $\mathbf{x}$, then the direction of $\mathbf{E}$ also changes: the rotation acts on both
the argument $\mathbf{x}$ and the function $\mathbf{E}$ itself.

In contrast, there are maps $\mathbb{R}^3 \to \mathbb{R}^3$ where, although the domain and codomain have
the same dimension, vectors in them transform under different rotations. For example,
in particle physics there exists an object called a quark field which, for our (admittedly,
slightly dumbed down) purposes, can be thought of as a map $\mathbb{R}^3 \to \mathbb{R}^3$. This is a
quantum field whose ripples are the particles that we call quarks, but these details can
be safely ignored for the next couple of years of your life. We will write this field as
$q_a(\mathbf{x})$ where the $a = 1, 2, 3$ label is the “colour” of the quark. If we rotate in space,
then $\mathbf{x}$ changes but the colour of the quark does not. There is then an independent
rotation that acts on the codomain and rotates the colour, but leaves the point in space
unchanged. Because the rotations that act on the domain and codomain are unrelated,
the quark field is usually not referred to as a vector field.


Taking Derivatives 


Given a tensor field, we can always construct higher rank tensors by taking derivatives.
In fact, we've already seen a prominent example of this earlier in these lectures. There,
we started with a scalar field $\phi(\mathbf{x})$ and differentiated to get the gradient $\nabla\phi$. This
means that we start with a rank 0 tensor and differentiate to get a rank 1 tensor.


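Strictly speaking, we didn't previously prove that $\nabla\phi$ is a vector field. But it's
straightforward to do so. As we've seen above, we need to show that it transforms
correctly under rotations. Any vector $\mathbf{v}$ can be decomposed in two different ways,

$$\mathbf{v} = v^i\mathbf{e}_i = v'^i\mathbf{e}'_i$$

where $\{\mathbf{e}_i\}$ and $\{\mathbf{e}'_i\}$ are two orthonormal bases, each obeying $\mathbf{e}_i\cdot\mathbf{e}_j = \mathbf{e}'_i\cdot\mathbf{e}'_j = \delta_{ij}$, and
$v^i$ and $v'^i$ are the two different coordinates for $\mathbf{v}$. If we expand $\mathbf{x}$ in this way

$$\mathbf{x} = x^i\mathbf{e}_i = x'^i\mathbf{e}'_i \quad\Longrightarrow\quad x^i = (\mathbf{e}_i\cdot\mathbf{e}'_j)\,x'^j \quad\Longrightarrow\quad \frac{\partial x^i}{\partial x'^j} = \mathbf{e}_i\cdot\mathbf{e}'_j$$

Here $\mathbf{e}_i\cdot\mathbf{e}'_j$ is the rotation matrix that takes us from one basis to the other. Meanwhile,
we can always expand one set of basis vectors in terms of the other,

$$\mathbf{e}_i = (\mathbf{e}_i\cdot\mathbf{e}'_j)\,\mathbf{e}'_j = \frac{\partial x^i}{\partial x'^j}\,\mathbf{e}'_j$$

This tells us that we could equally as well write the gradient as

$$\nabla\phi = \frac{\partial\phi}{\partial x^i}\,\mathbf{e}_i = \frac{\partial\phi}{\partial x^i}\frac{\partial x^i}{\partial x'^j}\,\mathbf{e}'_j = \frac{\partial\phi}{\partial x'^j}\,\mathbf{e}'_j$$

where the last step is the chain rule. This is the expected result: if you work in a different
primed basis, then you have the same definition of $\nabla\phi$, but just with primes on both $\mathbf{e}_i$ and
$\partial/\partial x^i$. This means that the components $\partial_i\phi$ transform correctly under a rotation, so
$\nabla\phi$ is indeed a vector.

We can extend the result above to any, suitably smooth, tensor field $T(\mathbf{x})$ of rank
$p$. We can differentiate this any number of times to get a new tensor field of rank, say,
$p + q$,

$$X_{i_1\ldots i_qj_1\ldots j_p}(\mathbf{x}) = \frac{\partial}{\partial x_{i_1}}\ldots\frac{\partial}{\partial x_{i_q}}\,T_{j_1\ldots j_p}(\mathbf{x}) \tag{6.11}$$

To verify that this is indeed a tensor, we need to check how it changes under a rotation.
In a new basis, we have $x'_i = R_{ij}x_j$ (where $R_{ij} = \mathbf{e}'_i\cdot\mathbf{e}_j$ in the notation above) and so

$$\frac{\partial x'_i}{\partial x_j} = R_{ij} \quad\Longrightarrow\quad \frac{\partial}{\partial x'_i} = \frac{\partial x_j}{\partial x'_i}\frac{\partial}{\partial x_j} = R_{ij}\frac{\partial}{\partial x_j}$$

which is the result we need for $X$ in (6.11) to qualify as a tensor field.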
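
The statement that derivatives transform with the same matrix $R_{ij}$ as the coordinates can be tested numerically for the gradient. The following sketch is an illustration under stated assumptions: the scalar field, the rotation angle and the sample point are arbitrary choices, and the gradient is estimated by central differences.

```python
import math

def phi(x):
    """An arbitrary smooth scalar field, chosen only for illustration."""
    return x[0] ** 2 * x[1] + math.sin(x[2])

def grad(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at the point x."""
    g = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def matvec(R, v):
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]

def transpose(R):
    return [list(col) for col in zip(*R)]

t = 0.4  # arbitrary rotation angle about the z-axis
R = [[math.cos(t), math.sin(t), 0.0],
     [-math.sin(t), math.cos(t), 0.0],
     [0.0, 0.0, 1.0]]

x = [0.3, -1.2, 0.8]
x_new = matvec(R, x)                       # x'_i = R_ij x_j

def phi_new(xp):
    """The same scalar field, expressed as a function of the rotated coordinates."""
    return phi(matvec(transpose(R), xp))   # x = R^T x'

lhs = grad(phi_new, x_new)                 # components d(phi)/dx'_i
rhs = matvec(R, grad(phi, x))              # R_ij d(phi)/dx_j
```

The two lists agree to within the finite-difference error, which is the component form of the statement that $\nabla\phi$ is a vector.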


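We can implement any of the tensorial manipulations that we met previously for
tensor fields. For example, if we start with a vector field $\mathbf{F}(\mathbf{x})$, we can form a rank 2
tensor field

$$T_{ij}(\mathbf{x}) = \frac{\partial F_i}{\partial x_j}$$

But we saw in (6.6) that any rank 2 tensor field can be decomposed into various pieces.
There is an anti-symmetric piece

$$A_{ij}(\mathbf{x}) = \epsilon_{ijk}B_k(\mathbf{x}) \quad\text{with}\quad B_k = \frac{1}{2}\epsilon_{ijk}\frac{\partial F_i}{\partial x_j} = -\frac{1}{2}(\nabla\times\mathbf{F})_k$$

and a trace piece

$$Q = \frac{\partial F_i}{\partial x_i} = \nabla\cdot\mathbf{F}$$

and, finally, a symmetric, traceless piece

$$P_{ij} = \frac{1}{2}\left(\frac{\partial F_i}{\partial x_j} + \frac{\partial F_j}{\partial x_i}\right) - \frac{1}{3}\delta_{ij}\,\nabla\cdot\mathbf{F}$$

Obviously, the first two of these are familiar tensors (in this case a vector and a scalar)
from earlier sections.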
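
To see the three pieces in action, here is a short worked example. The field is chosen purely for illustration: take the shear flow $\mathbf{F} = (y, 0, 0)$ and, as a convention for this example, write $T_{ij} = \partial F_i/\partial x_j$. The only non-zero component is $T_{12} = 1$, and the pieces work out as

$$A_{12} = -A_{21} = \tfrac{1}{2}, \qquad \mathbf{B} = \left(0, 0, \tfrac{1}{2}\right) = -\tfrac{1}{2}\nabla\times\mathbf{F}, \qquad Q = \nabla\cdot\mathbf{F} = 0, \qquad P_{12} = P_{21} = \tfrac{1}{2}$$

with all other components vanishing. Adding them back together gives

$$T_{12} = P_{12} + \epsilon_{123}B_3 = \tfrac{1}{2} + \tfrac{1}{2} = 1, \qquad T_{21} = P_{21} + \epsilon_{213}B_3 = \tfrac{1}{2} - \tfrac{1}{2} = 0$$

as it should. In words: a simple shear has no divergence, and splits equally into a rotation (the curl piece) and a volume-preserving stretch (the symmetric traceless piece).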


6.2 Physical Examples 


Our discussion above was rooted firmly in mathematics. There are many places in 
physics where tensors appear. Here we give a handful of examples. 
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6.2.1 Electric Fields in Matter 
Apply an electric field E to a lump of stuff. A number of things can happen. 


If the lump of stuff is an insulator then the material will become polarised. This 
means that the positive electric charge will be pushed in one direction, the negative 
in another until the lump of stuff acts like a dipole. (This is described in some detail 
in Section 7 of the lectures on Electromagnetism.) One might think that the resulting 
polarisation vector P points in the same direction as the electric field E, but that’s too 
simplistic. For many lumps of stuff, the underlying crystal structure allows the electric 
charges to shift more freely in some directions than others. The upshot is that the 
relation between polarisation P and applied electric field E is given by 


$$\mathbf{P} = \alpha\mathbf{E}$$

where $\alpha$ is a matrix known as the polarisation tensor. In a given basis, it has
components $\alpha_{ij}$.


There is a similar story if the lump of stuff is a conductor. This time an applied 
electric field gives rise to a current density J. Again, the current is not necessarily 
parallel to the electric field. The relationship between them is now 


$$\mathbf{J} = \sigma\mathbf{E}$$

This is known as Ohm’s law. In general $\sigma$ is a $3 \times 3$ matrix known as the conductivity
tensor; in a given basis, it has components $\sigma_{ij}$.


What can we say about $\sigma$ when the material is isotropic, meaning that it looks the
same in all directions? In this case, no direction is any different from any other. With
no preferred direction, the conductivity tensor must be proportional to an invariant
tensor, so that it looks the same in all coordinate systems. What are our options?

For 3d materials, the only option is $\sigma_{ij} = \sigma\delta_{ij}$, which ensures that the current
does indeed run parallel to the electric field. In this case $\sigma$ is just referred to as the
conductivity.


However, suppose that we're dealing with a thin wafer of material in which both the
current and electric field are restricted to lie in a plane. This changes the story because
now we're dealing with vectors in $\mathbb{R}^2$ rather than $\mathbb{R}^3$ and $\mathbb{R}^2$ is special because there
are two invariant 2-tensors in this dimension: $\delta_{ij}$ and $\epsilon_{ij}$. This means that the most
general conductivity tensor for an isotropic 2d material takes the form

$$\sigma_{ij} = \sigma_{xx}\delta_{ij} + \sigma_{xy}\epsilon_{ij} = \begin{pmatrix} \sigma_{xx} & \sigma_{xy} \\ -\sigma_{xy} & \sigma_{xx} \end{pmatrix}$$

Here $\sigma_{xx}$ is called the longitudinal conductivity while $\sigma_{xy}$ is called the Hall conductivity.
If $\sigma_{xy} \neq 0$ then an electric field in the $x$-direction induces a current in the $y$-direction.
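
Both claims, that this form of the conductivity is unchanged by rotations in the plane and that a Hall term produces a transverse current, can be checked in a few lines. The numbers and function names below are made up for the illustration.

```python
import math

def conductivity(sigma_xx, sigma_xy):
    """sigma_ij = sigma_xx delta_ij + sigma_xy epsilon_ij for an isotropic 2d material."""
    return [[sigma_xx, sigma_xy], [-sigma_xy, sigma_xx]]

def rotate(R, sigma):
    """Return sigma'_ij = R_ik R_jl sigma_kl."""
    return [[sum(R[i][k] * R[j][l] * sigma[k][l]
                 for k in range(2) for l in range(2))
             for j in range(2)] for i in range(2)]

sigma = conductivity(3.0, 0.5)          # sample values
t = 1.1                                 # arbitrary rotation angle
R = [[math.cos(t), math.sin(t)], [-math.sin(t), math.cos(t)]]
sigma_rot = rotate(R, sigma)            # should reproduce sigma

E = [1.0, 0.0]                          # electric field along x
J = [sum(sigma[i][j] * E[j] for j in range(2)) for i in range(2)]
```

With these sample values the current is `J = [3.0, -0.5]`: the longitudinal part runs along the field, while the Hall part runs at right angles to it.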


As an aside, it turns out that the seemingly mundane question of understanding
$\sigma_{xy}$ in real materials is closely tied to some of the most interesting breakthroughs in
mathematics in recent decades! This is the subject of the Quantum Hall Effect.


6.2.2 The Inertia Tensor 


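Another simple example of a tensor arises in Newtonian mechanics. A rigid body
rotating about the origin can be modelled by some number of masses $m_a$ at positions $\mathbf{x}_a$,
all moving with velocity $\dot{\mathbf{x}}_a = \boldsymbol{\omega}\times\mathbf{x}_a$. Here $\boldsymbol{\omega}$ is known as the angular velocity. The
angular velocity $\boldsymbol{\omega}$ is related to the angular momentum $\mathbf{L}$ by

$$\mathbf{L} = I\boldsymbol{\omega} \tag{6.12}$$

with $I$ the inertia tensor. The angular momentum does not necessarily lie parallel to
the angular velocity and, correspondingly, $I$ is in general a matrix, rather than a single
number. In fact, we can easily derive an expression for the inertia tensor. The angular
momentum is

$$\mathbf{L} = \sum_a m_a\,\mathbf{x}_a\times\dot{\mathbf{x}}_a = \sum_a m_a\,\mathbf{x}_a\times(\boldsymbol{\omega}\times\mathbf{x}_a) = \sum_a m_a\left(|\mathbf{x}_a|^2\boldsymbol{\omega} - (\mathbf{x}_a\cdot\boldsymbol{\omega})\,\mathbf{x}_a\right)$$

In components, $L_i = I_{ij}\omega_j$, where

$$I_{ij} = \sum_a m_a\left(|\mathbf{x}_a|^2\delta_{ij} - (\mathbf{x}_a)_i(\mathbf{x}_a)_j\right)$$

For a continuous object with density $\rho(\mathbf{x})$, we can replace the sum with a volume
integral

$$I_{ij} = \int_V \rho(\mathbf{x})\left(|\mathbf{x}|^2\delta_{ij} - x_ix_j\right)dV \tag{6.13}$$

So, for example, $I_{33} = \int \rho\,(x_1^2 + x_2^2)\,dV$ and $I_{12} = -\int \rho\,x_1x_2\,dV$.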
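
The point-mass formula can be tested directly against the definition of angular momentum. In the sketch below the masses, positions and angular velocity are made-up sample values and the function names are illustrative. It builds $I_{ij}$, computes $\mathbf{L} = I\boldsymbol{\omega}$, and compares with the sum of $m_a\,\mathbf{x}_a\times(\boldsymbol{\omega}\times\mathbf{x}_a)$.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def inertia_tensor(masses, positions):
    """I_ij = sum_a m_a (|x_a|^2 delta_ij - (x_a)_i (x_a)_j)."""
    I = [[0.0] * 3 for _ in range(3)]
    for m, x in zip(masses, positions):
        r2 = sum(c * c for c in x)
        for i in range(3):
            for j in range(3):
                I[i][j] += m * ((r2 if i == j else 0.0) - x[i] * x[j])
    return I

# Sample rigid body and angular velocity (arbitrary values).
masses = [1.0, 2.0, 0.5]
positions = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [-1.0, 2.0, 0.5]]
omega = [0.3, -0.2, 1.0]

I = inertia_tensor(masses, positions)
L_tensor = [sum(I[i][j] * omega[j] for j in range(3)) for i in range(3)]

# Direct evaluation of L = sum_a m_a x_a × (omega × x_a)
L_direct = [0.0, 0.0, 0.0]
for m, x in zip(masses, positions):
    term = cross(x, cross(omega, x))
    L_direct = [L_direct[i] + m * term[i] for i in range(3)]

misalignment = cross(L_tensor, omega)   # non-zero when L is not parallel to omega
```

For a generic arrangement like this one, `misalignment` is non-zero: the angular momentum and the angular velocity point in different directions, which is why $I$ has to be a matrix.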


An Example: A Sphere 


For a ball of radius $R$ and density $\rho(r)$, the inertia tensor is

$$I_{ij} = \int_V \rho(r)\left(r^2\delta_{ij} - x_ix_j\right)dV$$

The second of these terms is the integral (6.9) that we simplified in Section 6.1.3 using
isotropy arguments. Using (6.10), we have

$$I_{ij} = \frac{2}{3}\delta_{ij}\int_V \rho(r)\,r^2\,dV = \frac{8\pi}{3}\delta_{ij}\int_0^R dr\,\rho(r)\,r^4$$

For example, if $\rho(r) = \rho_0$ is constant, then $I_{ij} = \frac{8}{15}\pi\rho_0R^5\,\delta_{ij} = \frac{2}{5}MR^2\,\delta_{ij}$ where $M$ is
the mass of the sphere.
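
For completeness, here are the intermediate steps, using only (6.10) and the mass of a uniform ball. The first term in the inertia tensor is $\delta_{ij}\int_V\rho\,r^2\,dV$ and the second is $\frac{1}{3}$ of the same thing, so

$$I_{ij} = \left(1 - \frac{1}{3}\right)\delta_{ij}\int_V \rho(r)\,r^2\,dV = \frac{2}{3}\delta_{ij}\cdot 4\pi\int_0^R dr\,\rho(r)\,r^4$$

For constant density $\rho_0$, the radial integral is $R^5/5$ and the mass is $M = \frac{4}{3}\pi R^3\rho_0$, giving

$$I_{ij} = \frac{8\pi}{15}\rho_0R^5\,\delta_{ij} = \frac{2}{5}\left(\frac{4}{3}\pi R^3\rho_0\right)R^2\,\delta_{ij} = \frac{2}{5}MR^2\,\delta_{ij}$$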


Another Example: A Cylinder 


The sphere is rather special because the inertia tensor is proportional to $\delta_{ij}$. That’s
not the case more generally. Consider, for example, a solid 3d cylinder of radius $a$ and
height $2L$, with uniform density $\rho$. The mass is $M = 2\pi a^2L\rho$. We align the cylinder
with the $z$-axis and work in cylindrical polar coordinates $x = r\cos\phi$ and $y = r\sin\phi$.
The components of the inertia tensor are then

$$I_{33} = \int_V \rho\,(x^2 + y^2)\,dV = \rho\int_0^{2\pi}d\phi\int_0^a dr\int_{-L}^{+L}dz\;r\cdot r^2 = \rho\pi La^4$$

$$I_{11} = \int_V \rho\,(y^2 + z^2)\,dV = \rho\int_0^{2\pi}d\phi\int_0^a dr\int_{-L}^{+L}dz\;r\left(r^2\sin^2\phi + z^2\right) = \rho\pi a^2L\left(\frac{a^2}{2} + \frac{2L^2}{3}\right)$$

By symmetry, $I_{22} = I_{11}$. For the off-diagonal elements, we have

$$I_{13} = -\int_V \rho\,x_1x_3\,dV = -\rho\int_0^{2\pi}d\phi\int_0^a dr\int_{-L}^{+L}dz\;r^2z\cos\phi = 0$$

where the integral vanishes due to the $\phi$ integration. Similarly, $I_{12} = I_{23} = 0$. We find
that the inertia tensor for the cylinder is

$$I = \mathrm{diag}\left(M\left(\frac{a^2}{4} + \frac{L^2}{3}\right),\; M\left(\frac{a^2}{4} + \frac{L^2}{3}\right),\; \frac{1}{2}Ma^2\right) \tag{6.14}$$


Note that the inertia tensor is diagonal in our chosen coordinates. 


The Eigenvectors of the Inertia Tensor 


The inertia tensor $I$ defined in (6.13) has a special property: it is symmetric

$$I_{ij} = I_{ji}$$

Any symmetric matrix $I$ can always be diagonalised by an appropriate rotation. This
means that there exists an $R \in SO(3)$ such that

$$I' = RIR^T = \mathrm{diag}(I_1, I_2, I_3)$$

Another way of saying this is that any symmetric rank 2 tensor has a basis of orthonormal
eigenvectors $\{\mathbf{e}_i\}$, with $I_i$ the corresponding eigenvalues.

In the case of the inertia tensor, the eigenvectors $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ are called the principal
axes of the solid. It means that any object, no matter how complicated, has its own
preferred set of orthonormal axes embedded within it. If the object has some symmetry, 
then the principal axes will always be aligned with this symmetry. This, for example, 
was the case for the cylinder that we computed above where aligning the cylinder with 
the z-axis automatically gave us a diagonal inertia tensor (6.14). 


In general, it will be less obvious where the principal axes lie. For example, the figure
on the right shows the asteroid Toutatis, which is notable for its lumpy shape.
The principal axes are shown embedded in the asteroid.

From (6.12), the angular momentum $\mathbf{L}$ is aligned with the angular velocity $\boldsymbol{\omega}$ only if a
body spins about one of its principal axes. It turns out that, in this case, nice things
happen and the body spins smoothly. However, if $\mathbf{L}$ and $\boldsymbol{\omega}$ are misaligned, the body
exhibits more complicated tumbling, wobbling motion as it spins. You can learn all about
this in the lectures on Classical Dynamics. (For what it’s worth, Toutatis does not spin
about a principal axis.)
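
The alignment statement is easy to see for the cylinder, whose inertia tensor (6.14) is already diagonal. The sketch below uses made-up values for the mass and dimensions and illustrative function names. It compares a spin about the symmetry axis with a spin about a tilted axis.

```python
def cylinder_inertia(M, a, L):
    """Diagonal entries of the inertia tensor (6.14) for a uniform cylinder along z."""
    I_perp = M * (a ** 2 / 4 + L ** 2 / 3)
    return [I_perp, I_perp, M * a ** 2 / 2]

def angular_momentum(I_diag, omega):
    """L_i = I_i omega_i when the axes are the principal axes."""
    return [I_diag[i] * omega[i] for i in range(3)]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

I_diag = cylinder_inertia(M=2.0, a=1.0, L=3.0)    # sample values

omega_axis = [0.0, 0.0, 1.5]                      # spin about a principal axis
omega_tilt = [1.0, 0.0, 1.0]                      # spin about a tilted axis

L_axis = angular_momentum(I_diag, omega_axis)
L_tilt = angular_momentum(I_diag, omega_tilt)

aligned = cross(L_axis, omega_axis)               # vanishes: L is parallel to omega
misaligned = cross(L_tilt, omega_tilt)            # non-zero: L is not parallel to omega
```

For the tilted spin the two vectors differ because the eigenvalues $I_1$ and $I_3$ are unequal, so each component of $\boldsymbol{\omega}$ is rescaled by a different amount.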


6.2.3 Higher Rank Tensors 


You might reasonably complain that, after all that work defining tensors, the examples 
that we’ve given here are nothing more exotic than matrices, mapping one vector to 
another. And you would be right. However, as we get to more sophisticated theories of 
physics, tensors of higher rank do make an appearance. Here we don’t give full details, 
but just say a few words to give you a flavour of things to come. 


Perhaps the simplest example arises in the theory of elastic materials. These materials
can be subjected to strain, which describes the displacement of the material at
each point, and stress, which describes the forces acting on the material at each point.
But each of these is itself a 2-tensor (strictly a tensor field). The strain tensor $e_{ij}$ is
a symmetric tensor that describes the way the displacement in the $x^i$ direction varies
in the $x^j$ direction. The stress tensor $\sigma_{ij}$ describes the component of the force $F_i$ across
a plane normal to $x^j$. These two tensors are related by

$$\sigma_{ij} = C_{ijkl}\,e_{kl}$$

This is the grown up version of Hooke's law. In general an elastic material is characterised
by the elasticity tensor, also known as the stiffness tensor, $C_{ijkl}$.
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Higher rank tensors also appear prominently in more advanced descriptions of ge- 
ometry. In higher dimensions, the simple Gaussian curvature that we met in Section 2 
isn't enough to capture all the interesting ways in which spaces can curve in different 
directions. Instead, it is replaced by a 4-tensor $R_{ijkl}$ known as the Riemann curvature.
In the context of physics, this 4-tensor describes the bending of space and time and is 
needed for the grown-up version of Newton's law of gravity. 


6.3 A Unification of Integration Theorems 


In this final section, we turn back to matters of mathematics. The three integral 
theorems that we met in Section 4 are obviously closely related. To end these lectures, 
we show how they can be presented in a unified framework. This requires us to introduce 
some novel and slightly formal ideas. These go quite a bit beyond what is usually 
covered in an introductory course on vector calculus, but we will meet these objects 
again in later courses on Differential Geometry and General Relativity. View this 
section as a taste of things to come. 


6.3.1 Integrating in Higher Dimensions 


Our unified framework will give us integral theorems in any dimension $\mathbb{R}^n$. If you look
back at Section 4, you'll notice that the divergence theorem already holds in any $\mathbb{R}^n$.
Meanwhile, Stokes’ theorem is restricted to surfaces in $\mathbb{R}^3$ for the very simple reason
that the cross-product is only defined in $\mathbb{R}^3$. This suggests that before we can extend
our integral theorems to higher dimensions, we should first ask a more basic question:
how do we extend the cross product to higher dimensions?


The introduction of tensors gives us a way to do this. Given two vectors $\mathbf{a}$ and $\mathbf{b}$ in
$\mathbb{R}^3$, the cross-product is

$$(\mathbf{a}\times\mathbf{b})_i = \epsilon_{ijk}a_jb_k$$

From this perspective, the reason that the cross product can only be employed in $\mathbb{R}^3$
is because it’s only there that the $\epsilon_{ijk}$ symbol has three entries. If, in contrast, we're in
$\mathbb{R}^4$ then we have $\epsilon_{ijkl}$ and so if we feed it two vectors $\mathbf{a}$ and $\mathbf{b}$, then we find ourselves
with a tensor of rank 2, $T_{ij} = \epsilon_{ijkl}a_kb_l$.


The tensors that we get from an epsilon symbol are always special, in the sense
that they are totally anti-symmetric. The anti-symmetry condition doesn't impose any
extra constraint on a 0-tensor $\phi$ or a 1-tensor $a_i$ as these are just scalar fields and vector
fields respectively. It only kicks in when we get to tensors of rank 2 or higher.




With this in mind, we can revisit the cross product. We can define the cross product
in any dimension $\mathbb{R}^n$: it is a map that eats two vectors $\mathbf{a}$ and $\mathbf{b}$ and spits back an
anti-symmetric $(n-2)$-tensor

$$(\mathbf{a}\times\mathbf{b})_{i_1\ldots i_{n-2}} = \epsilon_{i_1\ldots i_n}\,a_{i_{n-1}}b_{i_n}$$

The only thing that's special about $\mathbb{R}^3$ is that we get back another vector, rather than
a higher rank tensor.
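
To make the generalised cross product concrete, the sketch below builds the rank 2 tensor $T_{ij} = \epsilon_{ijkl}a_kb_l$ in $\mathbb{R}^4$. The sample vectors and function names are choices made for this illustration; the epsilon symbol is computed as the sign of a permutation.

```python
from itertools import product

def levi_civita_n(indices):
    """Sign of `indices` as a permutation of (0, ..., n-1); 0 if any index repeats."""
    if len(set(indices)) != len(indices):
        return 0
    idx = list(indices)
    sign = 1
    for i in range(len(idx)):
        while idx[i] != i:          # sort by swaps, flipping the sign each time
            j = idx[i]
            idx[i], idx[j] = idx[j], idx[i]
            sign = -sign
    return sign

def cross_4d(a, b):
    """T_ij = epsilon_ijkl a_k b_l for two vectors a, b in R^4."""
    n = 4
    return [[sum(levi_civita_n((i, j, k, l)) * a[k] * b[l]
                 for k, l in product(range(n), repeat=2))
             for j in range(n)] for i in range(n)]

a = [1, 2, 0, -1]    # sample vectors
b = [0, 1, 3, 2]
T = cross_4d(a, b)

# Contracting with either input vector gives zero, the analogue of
# a . (a × b) = b . (a × b) = 0 in three dimensions.
T_dot_a = [sum(T[i][j] * a[j] for j in range(4)) for i in range(4)]
T_dot_b = [sum(T[i][j] * b[j] for j in range(4)) for i in range(4)]
```

The result is anti-symmetric, $T_{ij} = -T_{ji}$, and is annihilated by both $\mathbf{a}$ and $\mathbf{b}$, so it plays the same role as the normal to the plane spanned by the two vectors.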


There is also a slightly different role played by the epsilon symbol $\epsilon_{i_1\ldots i_n}$: it provides
a map from anti-symmetric $p$-tensors to anti-symmetric $(n-p)$-tensors, simply by
contracting indices,

$$\epsilon : T_{i_1\ldots i_p} \mapsto \frac{1}{(n-p)!}\,\epsilon_{i_1\ldots i_n}T_{i_{n-p+1}\ldots i_n} \tag{6.15}$$

This map goes by the fancy name of the Hodge dual. (Actually, it's an entirely trivial
version of the Hodge dual. The proper Hodge dual is a generalisation of this idea to
curved spaces.)


Our next step is to think about what this has to do with integration. Recall that
earlier in these lectures we found two natural ways to integrate vector fields in $\mathbb{R}^3$. The
first is along a line

$$\int_C \mathbf{F}\cdot d\mathbf{x} \tag{6.16}$$

which captures the component of the vector field tangent to the line. We can perform this
procedure in any dimension $\mathbb{R}^n$. The second operation is to integrate a vector field over
a surface

$$\int_S \mathbf{F}\cdot d\mathbf{S} \tag{6.17}$$

where $d\mathbf{S}$ points in the direction normal to the surface. This integration captures the
component of the vector field normal to the surface and only makes sense in $\mathbb{R}^3$. This
is because it's only in $\mathbb{R}^3$ that a two-dimensional surface has a unique normal. More
operationally, this normal, which is buried in the definition of $d\mathbf{S}$, requires us to use
the cross product. For a parameterised surface $\mathbf{x}(u, v)$, the vector area element is

$$d\mathbf{S} = \frac{\partial\mathbf{x}}{\partial u}\times\frac{\partial\mathbf{x}}{\partial v}\,du\,dv$$

or, in components,

$$dS_i = \epsilon_{ijk}\frac{\partial x^j}{\partial u}\frac{\partial x^k}{\partial v}\,du\,dv$$

Now comes a mathematical sleight of hand. Rather than thinking of (6.17) as the
integral of a vector field projected normal to the surface, instead think of it as the
integral of an anti-symmetric 2-tensor $F_{ij} = \epsilon_{ijk}F_k$ integrated tangent to the surface.
We then have

$$\int_S \mathbf{F}\cdot d\mathbf{S} = \int_S F_{ij}\,dS_{ij} \quad\text{with}\quad dS_{ij} = \frac{1}{2}\left(\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial v} - \frac{\partial x^i}{\partial v}\frac{\partial x^j}{\partial u}\right)du\,dv \tag{6.18}$$

This is the same equation as before, just with the epsilon symbol viewed as part of
the integrand $F_{ij}$ rather than as part of the measure $dS_{ij}$. Note that we've retained the
anti-symmetry of the area element $dS_{ij}$ that was inherent in our original cross product
definition of $d\mathbf{S}$. Strictly speaking this isn't necessary because we're contracting with
anti-symmetric indices in $F_{ij}$, but it turns out that it's best to think of both objects
$F_{ij}$ and $dS_{ij}$ as individually anti-symmetric.
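
The claim that the two ways of writing the flux agree can be checked at a single point of a surface. The sketch below uses the unit sphere parameterised by polar and azimuthal angles, with an arbitrary value for the vector field at the chosen point; all names and numbers are illustrative assumptions. It compares $\mathbf{F}\cdot d\mathbf{S}$ with $F_{ij}\,dS_{ij}$, both per unit $du\,dv$.

```python
import math
from itertools import product

def levi_civita(i, j, k):
    """Epsilon symbol for indices in {0, 1, 2} (zero-based)."""
    return (i - j) * (j - k) * (k - i) // 2

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Tangent vectors dx/du and dx/dv of the unit sphere
# x(u, v) = (sin u cos v, sin u sin v, cos u) at one sample point.
u, v = 0.9, 0.4
x_u = [math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), -math.sin(u)]
x_v = [-math.sin(u) * math.sin(v), math.sin(u) * math.cos(v), 0.0]

F = [0.2, -1.0, 0.7]   # value of some vector field at that point (arbitrary)

# Vector picture: F . (dx/du × dx/dv)
normal = cross(x_u, x_v)
flux_vector = sum(F[i] * normal[i] for i in range(3))

# Tensor picture: F_ij dS_ij with F_ij = eps_ijk F_k and dS_ij as in (6.18)
F2 = [[sum(levi_civita(i, j, k) * F[k] for k in range(3)) for j in range(3)]
      for i in range(3)]
dS2 = [[(x_u[i] * x_v[j] - x_v[i] * x_u[j]) / 2 for j in range(3)] for i in range(3)]
flux_tensor = sum(F2[i][j] * dS2[i][j] for i, j in product(range(3), repeat=2))
```

The two numbers coincide, which is just the statement that the epsilon symbol can be moved from the measure into the integrand.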


This new perspective suggests a way to generalise to higher dimensions. In the line 
integral (6.16) we're integrating a vector field over a line. In the surface integral (6.18), 
we're really integrating an anti-symmetric 2-tensor over a surface. The key idea is that 
one can integrate a totally anti-symmetric p-tensor over a p-dimensional subspace. 


Specifically, given an anti-symmetric $p$-tensor, the generalisation of the line integral
(6.16) is the integration over a $p$-dimensional subspace,

$$\int_M T_{i_1\ldots i_p}\,dS_{i_1\ldots i_p} \tag{6.19}$$

where $\dim(M) = p$. Here $dS_{i_1\ldots i_p}$ is a higher dimensional version of the “area element”
defined in (6.18).

Alternatively, the higher dimensional version of the surface integral (6.17) involves
first mapping the $p$-tensor to an $(n-p)$-tensor using the Hodge dual. This can subsequently
be integrated over an $(n-p)$-dimensional subspace

$$\int_M T_{i_1\ldots i_p}\,\epsilon_{i_1\ldots i_pj_1\ldots j_{n-p}}\,dS_{j_1\ldots j_{n-p}} \tag{6.20}$$

with $\dim(M) = n - p$.
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In fact, we've already met an integral of the form (6.20) elsewhere in these lectures, 
since this is what we're implicitly doing when we integrate a scalar field over a volume. 
In this case the “area element” is just $dS_{i_1\ldots i_n} = \epsilon_{i_1\ldots i_n}\,dV$ and the two epsilon symbols
just multiply to a constant. When actually computing a volume integral, this extra
machinery is more of a distraction than a help. But if we want to know how to think
about things more generally then it's extremely useful.


6.3.2 Differentiating Anti-Symmetric Tensors 


We've now learned how to integrate anti-symmetric tensors. Our next step is to learn
how to differentiate them. We've already noted in (6.11) that we can differentiate a
$p$-tensor once to get a tensor of rank $p + 1$, but in general differentiating loses the
anti-symmetry property. As we now explain, there is a way to restore it so that when we
differentiate a totally anti-symmetric $p$-tensor, we end up with a totally anti-symmetric
$(p + 1)$-tensor.

For a scalar field, things are trivial. We can construct a vector field $\nabla\phi$ and this is
automatically “anti-symmetric” because there's nothing to anti-symmetrise.


If we're given a vector field $F_i$, we can differentiate and then anti-symmetrise by
hand. I will introduce a new symbol for “differentiation and anti-symmetrisation” and
write

$$(DF)_{ij} = \frac{1}{2}\left(\frac{\partial F_i}{\partial x^j} - \frac{\partial F_j}{\partial x^i}\right)$$

where the anti-symmetry is manifest on the right-hand side. I should confess that the
notation $DF$ is not at all standard. In subsequent courses, this object is usually viewed
as something called a “differential form" and written simply as $dF$ but the notation
$dF$ is loaded with all sorts of other connotations which are best ignored at this stage.
Hence the made-up notation $DF$.

In $\mathbb{R}^3$, this antisymmetric differentiation is equivalent to the curl using the Hodge
map (6.15). With the index ordering above, in which the derivative index comes second,
the two are related by

$$(\nabla\times\mathbf{F})_i = -\epsilon_{ijk}(DF)_{jk}$$

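But now we can extend this definition to any anti-symmetric $p$-tensor. We can always
differentiate and anti-symmetrise to get a $(p + 1)$-tensor defined by

$$(DT)_{i_1\ldots i_{p+1}} = \frac{1}{p+1}\left(\frac{\partial T_{i_1\ldots i_p}}{\partial x^{i_{p+1}}} + p\ \text{further terms}\right)$$

where the further terms involve replacing the derivative $\partial/\partial x^{i_{p+1}}$ with one of the other
coordinates $\partial/\partial x^{i_j}$ so that the whole shebang is fully anti-symmetric.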


Note that, with this definition of $D$, if we differentiate twice then we take a $p$-tensor
to a $(p + 2)$-tensor. But this $(p + 2)$-tensor always vanishes!

$$(DDT)_{i_1\ldots i_{p+2}} = 0$$

for any tensor $T$. This is because we'll have two derivatives contracted with an epsilon
and is the higher dimensional generalisation of the statements that $\nabla\times\nabla\phi = 0$ or
$\nabla\cdot(\nabla\times\mathbf{F}) = 0$.
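
The simplest case makes the mechanism explicit. Take a scalar field $\phi$, so that $(D\phi)_i = \partial\phi/\partial x^i$, and take the anti-symmetrised derivative of a vector field to be $(DF)_{ij} = \frac{1}{2}\left(\partial F_i/\partial x^j - \partial F_j/\partial x^i\right)$. Then

$$(DD\phi)_{ij} = \frac{1}{2}\left(\frac{\partial^2\phi}{\partial x^j\,\partial x^i} - \frac{\partial^2\phi}{\partial x^i\,\partial x^j}\right) = 0$$

because partial derivatives of a smooth function commute. Contracting with $\epsilon_{kij}$ turns this into the familiar statement $\nabla\times\nabla\phi = 0$ in $\mathbb{R}^3$. The general case works the same way: every term in $DDT$ contains two derivatives whose indices are anti-symmetrised, and these cancel in pairs.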


As an aside: this is actually the second time in these lectures that we've seen something
vanish when you act twice, although you'd be forgiven for failing to notice the
connection. Here our new anti-symmetric derivative obeys $D^2(\text{anything}) = 0$. But we
previously saw that the “boundary of a boundary" is always zero. This means that if
a higher dimensional space (really a manifold) $M$ has boundary $\partial M$ then $\partial(\partial M) = 0$.
Conceptually, these two ideas are very different but one can't help but be struck by
the similarity of the equations $D^2(\text{anything}) = 0$ and $\partial^2(\text{anything}) = 0$, even though
the “anything”s are very different objects in the two formulae. It turns out that this
similarity is pointing at a deep connection between the topology of spaces and the
kinds of tensors that one can put on these spaces. In fancy maths words, this is the
link between homology and cohomology.


Finally, we can now state the general integration theorem. Given an anti-symmetric
$p$-tensor $T$, then

$$\int_M (DT)_{i_1\ldots i_{p+1}}\,dS_{i_1\ldots i_{p+1}} = \int_{\partial M} T_{i_1\ldots i_p}\,dS_{i_1\ldots i_p} \tag{6.21}$$

Here $\dim(M) = p + 1$ and, therefore, the boundary has $\dim(\partial M) = p$. Note that we
don’t use a different letter to distinguish the integration measure over these various
spaces: everything is simply $dS$ and you have to look closer at the indices to see what
kind of space you’re integrating over.

The equation (6.21) is a unification of all integration theorems. It contains the
fundamental theorem of calculus (when $p = 0$), the divergence theorem (when $p = n - 1$)
and Stokes’ theorem (when $p = 1$ and $\mathbb{R}^n = \mathbb{R}^3$). Geometers refer to this generalised
theorem simply as Stokes’ theorem since that is the original result that it resembles
most. The proof is simply a higher dimensional version of the proofs that we sketched
previously.
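
It may help to unpack the simplest case, $p = 0$, as a sketch. Then $T$ is a scalar field $\phi$, the space $M$ is a curve $C$ running from a point $\mathbf{a}$ to a point $\mathbf{b}$, and the “area element” is just the line element $dx^i$. With $(D\phi)_i = \partial\phi/\partial x^i$, the general theorem reads

$$\int_C \frac{\partial\phi}{\partial x^i}\,dx^i = \phi(\mathbf{b}) - \phi(\mathbf{a})$$

The boundary $\partial C$ consists of the two end points, and the “integral” over this zero-dimensional space is simply the sum of the values of $\phi$ there, counted with a plus sign at the end of the curve and a minus sign at the start. This is the fundamental theorem of calculus for line integrals of a gradient.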
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There is, to put it mildly, quite a lot that I’m sweeping under the rug in the discussion 
above. In particular, the full Stokes’ theorem holds not only in $\mathbb{R}^n$ but also in a general
curved space known as a manifold. In that context, one has to be a lot more careful 
about what kind of tensors we're dealing with and, as I mentioned above, Stokes' 
theorem should be written using a kind of anti-symmetric tensor known as a differential 
form. None of this really matters when working in flat space, but the differences become 
crucial when thinking about curved spaces. If you want to learn more, these topics will 
be covered in glorious detail in later courses on Differential Geometry or, for physicists, 
General Relativity. 
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What You Really Need 


Here are expressions for div, grad, curl and the Laplacian in various coordinate systems. 


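Cartesian: $\mathbf{x} = (x, y, z)$

$$\nabla f = \frac{\partial f}{\partial x}\hat{\mathbf{x}} + \frac{\partial f}{\partial y}\hat{\mathbf{y}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}}$$

$$\nabla\cdot\mathbf{F} = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}$$

$$\nabla\times\mathbf{F} = \left(\frac{\partial F_z}{\partial y} - \frac{\partial F_y}{\partial z}\right)\hat{\mathbf{x}} + \left(\frac{\partial F_x}{\partial z} - \frac{\partial F_z}{\partial x}\right)\hat{\mathbf{y}} + \left(\frac{\partial F_y}{\partial x} - \frac{\partial F_x}{\partial y}\right)\hat{\mathbf{z}}$$

$$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}$$

Cylindrical Polars: $\mathbf{x} = (\rho\cos\phi, \rho\sin\phi, z)$

$$\nabla f = \frac{\partial f}{\partial\rho}\hat{\boldsymbol{\rho}} + \frac{1}{\rho}\frac{\partial f}{\partial\phi}\hat{\boldsymbol{\phi}} + \frac{\partial f}{\partial z}\hat{\mathbf{z}}$$

$$\nabla\cdot\mathbf{F} = \frac{1}{\rho}\frac{\partial(\rho F_\rho)}{\partial\rho} + \frac{1}{\rho}\frac{\partial F_\phi}{\partial\phi} + \frac{\partial F_z}{\partial z}$$

$$\nabla\times\mathbf{F} = \left(\frac{1}{\rho}\frac{\partial F_z}{\partial\phi} - \frac{\partial F_\phi}{\partial z}\right)\hat{\boldsymbol{\rho}} + \left(\frac{\partial F_\rho}{\partial z} - \frac{\partial F_z}{\partial\rho}\right)\hat{\boldsymbol{\phi}} + \frac{1}{\rho}\left(\frac{\partial(\rho F_\phi)}{\partial\rho} - \frac{\partial F_\rho}{\partial\phi}\right)\hat{\mathbf{z}}$$

$$\nabla^2 f = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial f}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial\phi^2} + \frac{\partial^2 f}{\partial z^2}$$

Spherical Polars: $\mathbf{x} = (r\sin\theta\cos\phi, r\sin\theta\sin\phi, r\cos\theta)$

$$\nabla f = \frac{\partial f}{\partial r}\hat{\mathbf{r}} + \frac{1}{r}\frac{\partial f}{\partial\theta}\hat{\boldsymbol{\theta}} + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\hat{\boldsymbol{\phi}}$$

$$\nabla\cdot\mathbf{F} = \frac{1}{r^2}\frac{\partial(r^2F_r)}{\partial r} + \frac{1}{r\sin\theta}\frac{\partial(\sin\theta\,F_\theta)}{\partial\theta} + \frac{1}{r\sin\theta}\frac{\partial F_\phi}{\partial\phi}$$

$$\nabla\times\mathbf{F} = \frac{1}{r\sin\theta}\left(\frac{\partial(\sin\theta\,F_\phi)}{\partial\theta} - \frac{\partial F_\theta}{\partial\phi}\right)\hat{\mathbf{r}} + \frac{1}{r}\left(\frac{1}{\sin\theta}\frac{\partial F_r}{\partial\phi} - \frac{\partial(rF_\phi)}{\partial r}\right)\hat{\boldsymbol{\theta}} + \frac{1}{r}\left(\frac{\partial(rF_\theta)}{\partial r} - \frac{\partial F_r}{\partial\theta}\right)\hat{\boldsymbol{\phi}}$$

$$\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial f}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2}$$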